Predict Bike Sharing Demand with AutoGluon Template¶

Project: Predict Bike Sharing Demand with AutoGluon¶

This notebook is a template with each step that you need to complete for the project.

Please fill in your code where there are explicit ? markers in the notebook. You are welcome to add more cells and code as you see fit.

Once you have completed all the code implementations, please export your notebook as a HTML file so the reviews can view your code. Make sure you have all outputs correctly outputted.

File-> Export Notebook As... -> Export Notebook as HTML

There is a writeup to complete as well after all code implememtation is done. Please answer all questions and attach the necessary tables and charts. You can complete the writeup in either markdown or PDF.

Completing the code template and writeup template will cover all of the rubric points for this project.

The rubric contains "Stand Out Suggestions" for enhancing the project beyond the minimum requirements. The stand out suggestions are optional. If you decide to pursue the "stand out suggestions", you can include the code in this notebook and also discuss the results in the writeup file.

Step 1: Create an account with Kaggle¶

Create Kaggle Account and download API key¶

Below is example of steps to get the API username and key. Each student will have their own username and key.

  1. Open account settings. kaggle1.png kaggle2.png
  2. Scroll down to API and click Create New API Token. kaggle3.png kaggle4.png
  3. Open up kaggle.json and use the username and key. kaggle5.png

Step 2: Download the Kaggle dataset using the kaggle python library¶

Open up Sagemaker Studio and use starter template¶

  1. Notebook should be using a ml.t3.medium instance (2 vCPU + 4 GiB)
  2. Notebook should be using kernal: Python 3 (MXNet 1.8 Python 3.7 CPU Optimized)

Install packages¶

In [1]:
!pip install -U pip
!pip install -U setuptools wheel
!pip install -U "mxnet<2.0.0" bokeh==2.0.1
!pip install autogluon --no-cache-dir
# Without --no-cache-dir, smaller aws instances may have trouble installing
Requirement already satisfied: pip in /opt/conda/lib/python3.10/site-packages (23.3.2)
Collecting pip
  Using cached pip-24.0-py3-none-any.whl.metadata (3.6 kB)
Using cached pip-24.0-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 23.3.2
    Uninstalling pip-23.3.2:
      Successfully uninstalled pip-23.3.2
Successfully installed pip-24.0
Requirement already satisfied: setuptools in /opt/conda/lib/python3.10/site-packages (69.5.1)
Requirement already satisfied: wheel in /opt/conda/lib/python3.10/site-packages (0.43.0)
Collecting mxnet<2.0.0
  Using cached mxnet-1.9.1-py3-none-manylinux2014_x86_64.whl.metadata (3.4 kB)
Collecting bokeh==2.0.1
  Using cached bokeh-2.0.1-py3-none-any.whl
Requirement already satisfied: PyYAML>=3.10 in /opt/conda/lib/python3.10/site-packages (from bokeh==2.0.1) (6.0.1)
Requirement already satisfied: python-dateutil>=2.1 in /opt/conda/lib/python3.10/site-packages (from bokeh==2.0.1) (2.9.0)
Requirement already satisfied: Jinja2>=2.7 in /opt/conda/lib/python3.10/site-packages (from bokeh==2.0.1) (3.1.3)
Requirement already satisfied: numpy>=1.11.3 in /opt/conda/lib/python3.10/site-packages (from bokeh==2.0.1) (1.26.4)
Requirement already satisfied: pillow>=4.0 in /opt/conda/lib/python3.10/site-packages (from bokeh==2.0.1) (9.5.0)
Requirement already satisfied: packaging>=16.8 in /opt/conda/lib/python3.10/site-packages (from bokeh==2.0.1) (23.2)
Requirement already satisfied: tornado>=5 in /opt/conda/lib/python3.10/site-packages (from bokeh==2.0.1) (6.4)
Requirement already satisfied: typing-extensions>=3.7.4 in /opt/conda/lib/python3.10/site-packages (from bokeh==2.0.1) (4.5.0)
Requirement already satisfied: requests<3,>=2.20.0 in /opt/conda/lib/python3.10/site-packages (from mxnet<2.0.0) (2.31.0)
Collecting graphviz<0.9.0,>=0.8.1 (from mxnet<2.0.0)
  Using cached graphviz-0.8.4-py2.py3-none-any.whl.metadata (6.4 kB)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.10/site-packages (from Jinja2>=2.7->bokeh==2.0.1) (2.1.5)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.10/site-packages (from python-dateutil>=2.1->bokeh==2.0.1) (1.16.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /opt/conda/lib/python3.10/site-packages (from requests<3,>=2.20.0->mxnet<2.0.0) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /opt/conda/lib/python3.10/site-packages (from requests<3,>=2.20.0->mxnet<2.0.0) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in /opt/conda/lib/python3.10/site-packages (from requests<3,>=2.20.0->mxnet<2.0.0) (1.26.18)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.10/site-packages (from requests<3,>=2.20.0->mxnet<2.0.0) (2024.2.2)
Using cached mxnet-1.9.1-py3-none-manylinux2014_x86_64.whl (49.1 MB)
Using cached graphviz-0.8.4-py2.py3-none-any.whl (16 kB)
Installing collected packages: graphviz, mxnet, bokeh
  Attempting uninstall: graphviz
    Found existing installation: graphviz 0.20.3
    Uninstalling graphviz-0.20.3:
      Successfully uninstalled graphviz-0.20.3
Successfully installed bokeh-2.0.1 graphviz-0.8.4 mxnet-1.9.1
Requirement already satisfied: autogluon in /opt/conda/lib/python3.10/site-packages (0.8.2)
Requirement already satisfied: autogluon.core==0.8.2 in /opt/conda/lib/python3.10/site-packages (from autogluon.core[all]==0.8.2->autogluon) (0.8.2)
Requirement already satisfied: autogluon.features==0.8.2 in /opt/conda/lib/python3.10/site-packages (from autogluon) (0.8.2)
Requirement already satisfied: autogluon.tabular==0.8.2 in /opt/conda/lib/python3.10/site-packages (from autogluon.tabular[all]==0.8.2->autogluon) (0.8.2)
Requirement already satisfied: autogluon.multimodal==0.8.2 in /opt/conda/lib/python3.10/site-packages (from autogluon) (0.8.2)
Requirement already satisfied: autogluon.timeseries==0.8.2 in /opt/conda/lib/python3.10/site-packages (from autogluon.timeseries[all]==0.8.2->autogluon) (0.8.2)
Requirement already satisfied: numpy<1.27,>=1.21 in /opt/conda/lib/python3.10/site-packages (from autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (1.26.4)
Requirement already satisfied: scipy<1.12,>=1.5.4 in /opt/conda/lib/python3.10/site-packages (from autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (1.11.4)
Requirement already satisfied: scikit-learn<1.5,>=1.3.0 in /opt/conda/lib/python3.10/site-packages (from autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (1.4.2)
Requirement already satisfied: networkx<4,>=3.0 in /opt/conda/lib/python3.10/site-packages (from autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (3.3)
Requirement already satisfied: pandas<2.2.0,>=2.0.0 in /opt/conda/lib/python3.10/site-packages (from autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (2.1.4)
Requirement already satisfied: tqdm<5,>=4.38 in /opt/conda/lib/python3.10/site-packages (from autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (4.66.2)
Requirement already satisfied: requests in /opt/conda/lib/python3.10/site-packages (from autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (2.31.0)
Requirement already satisfied: matplotlib in /opt/conda/lib/python3.10/site-packages (from autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (3.8.4)
Requirement already satisfied: boto3<2,>=1.10 in /opt/conda/lib/python3.10/site-packages (from autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (1.34.51)
Requirement already satisfied: autogluon.common==0.8.2 in /opt/conda/lib/python3.10/site-packages (from autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (0.8.2)
Collecting hyperopt<0.2.8,>=0.2.7 (from autogluon.core[all]==0.8.2->autogluon)
  Downloading hyperopt-0.2.7-py2.py3-none-any.whl.metadata (1.7 kB)
Requirement already satisfied: pydantic<2.0,>=1.10.4 in /opt/conda/lib/python3.10/site-packages (from autogluon.core[all]==0.8.2->autogluon) (1.10.14)
Collecting ray<2.7,>=2.6.3 (from ray[tune]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading ray-2.6.3-cp310-cp310-manylinux2014_x86_64.whl.metadata (12 kB)
Requirement already satisfied: Pillow<9.6,>=9.3 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (9.5.0)
Requirement already satisfied: torch<2.1,>=1.13 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (2.0.0.post101)
Requirement already satisfied: pytorch-lightning<2.1,>=2.0.0 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (2.0.9)
Requirement already satisfied: jsonschema<4.18,>=4.14 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (4.17.3)
Requirement already satisfied: seqeval<1.3.0,>=1.2.2 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (1.2.2)
Requirement already satisfied: evaluate<0.5.0,>=0.4.0 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (0.4.1)
Requirement already satisfied: accelerate<0.22.0,>=0.21.0 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (0.21.0)
Requirement already satisfied: transformers<4.32.0,>=4.31.0 in /opt/conda/lib/python3.10/site-packages (from transformers[sentencepiece]<4.32.0,>=4.31.0->autogluon.multimodal==0.8.2->autogluon) (4.31.0)
Requirement already satisfied: timm<0.10.0,>=0.9.5 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (0.9.16)
Requirement already satisfied: torchvision<0.16.0,>=0.14.0 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (0.15.2a0+072ec57)
Requirement already satisfied: scikit-image<0.20.0,>=0.19.1 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (0.19.3)
Requirement already satisfied: text-unidecode<1.4,>=1.3 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (1.3)
Requirement already satisfied: torchmetrics<1.1.0,>=1.0.0 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (1.0.3)
Requirement already satisfied: nptyping<2.5.0,>=1.4.4 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (2.4.1)
Requirement already satisfied: omegaconf<2.3.0,>=2.1.1 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (2.2.3)
Requirement already satisfied: pytorch-metric-learning<2.0,>=1.3.0 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (1.7.3)
Requirement already satisfied: nlpaug<1.2.0,>=1.1.10 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (1.1.11)
Requirement already satisfied: nltk<4.0.0,>=3.4.5 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (3.8.1)
Requirement already satisfied: openmim<0.4.0,>=0.3.7 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (0.3.7)
Requirement already satisfied: defusedxml<0.7.2,>=0.7.1 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (0.7.1)
Requirement already satisfied: jinja2<3.2,>=3.0.3 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (3.1.3)
Requirement already satisfied: tensorboard<3,>=2.9 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (2.12.3)
Requirement already satisfied: pytesseract<0.3.11,>=0.3.9 in /opt/conda/lib/python3.10/site-packages (from autogluon.multimodal==0.8.2->autogluon) (0.3.10)
Requirement already satisfied: catboost<1.3,>=1.1 in /opt/conda/lib/python3.10/site-packages (from autogluon.tabular[all]==0.8.2->autogluon) (1.2.3)
Requirement already satisfied: xgboost<1.8,>=1.6 in /opt/conda/lib/python3.10/site-packages (from autogluon.tabular[all]==0.8.2->autogluon) (1.7.6)
Requirement already satisfied: fastai<2.8,>=2.3.1 in /opt/conda/lib/python3.10/site-packages (from autogluon.tabular[all]==0.8.2->autogluon) (2.7.14)
Requirement already satisfied: lightgbm<3.4,>=3.3 in /opt/conda/lib/python3.10/site-packages (from autogluon.tabular[all]==0.8.2->autogluon) (3.3.5)
Requirement already satisfied: joblib<2,>=1.1 in /opt/conda/lib/python3.10/site-packages (from autogluon.timeseries==0.8.2->autogluon.timeseries[all]==0.8.2->autogluon) (1.4.0)
Requirement already satisfied: statsmodels<0.15,>=0.13.0 in /opt/conda/lib/python3.10/site-packages (from autogluon.timeseries==0.8.2->autogluon.timeseries[all]==0.8.2->autogluon) (0.14.1)
Requirement already satisfied: gluonts<0.14,>=0.13.1 in /opt/conda/lib/python3.10/site-packages (from autogluon.timeseries==0.8.2->autogluon.timeseries[all]==0.8.2->autogluon) (0.13.7)
Requirement already satisfied: statsforecast<1.5,>=1.4.0 in /opt/conda/lib/python3.10/site-packages (from autogluon.timeseries==0.8.2->autogluon.timeseries[all]==0.8.2->autogluon) (1.4.0)
Requirement already satisfied: mlforecast<0.7.4,>=0.7.0 in /opt/conda/lib/python3.10/site-packages (from autogluon.timeseries==0.8.2->autogluon.timeseries[all]==0.8.2->autogluon) (0.7.3)
Requirement already satisfied: ujson<6,>=5 in /opt/conda/lib/python3.10/site-packages (from autogluon.timeseries==0.8.2->autogluon.timeseries[all]==0.8.2->autogluon) (5.9.0)
Requirement already satisfied: psutil<6,>=5.7.3 in /opt/conda/lib/python3.10/site-packages (from autogluon.common==0.8.2->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (5.9.8)
Requirement already satisfied: setuptools in /opt/conda/lib/python3.10/site-packages (from autogluon.common==0.8.2->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (69.5.1)
Requirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.10/site-packages (from accelerate<0.22.0,>=0.21.0->autogluon.multimodal==0.8.2->autogluon) (23.2)
Requirement already satisfied: pyyaml in /opt/conda/lib/python3.10/site-packages (from accelerate<0.22.0,>=0.21.0->autogluon.multimodal==0.8.2->autogluon) (6.0.1)
Requirement already satisfied: botocore<1.35.0,>=1.34.51 in /opt/conda/lib/python3.10/site-packages (from boto3<2,>=1.10->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (1.34.51)
Requirement already satisfied: jmespath<2.0.0,>=0.7.1 in /opt/conda/lib/python3.10/site-packages (from boto3<2,>=1.10->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (1.0.1)
Requirement already satisfied: s3transfer<0.11.0,>=0.10.0 in /opt/conda/lib/python3.10/site-packages (from boto3<2,>=1.10->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (0.10.1)
Requirement already satisfied: graphviz in /opt/conda/lib/python3.10/site-packages (from catboost<1.3,>=1.1->autogluon.tabular[all]==0.8.2->autogluon) (0.8.4)
Requirement already satisfied: plotly in /opt/conda/lib/python3.10/site-packages (from catboost<1.3,>=1.1->autogluon.tabular[all]==0.8.2->autogluon) (5.19.0)
Requirement already satisfied: six in /opt/conda/lib/python3.10/site-packages (from catboost<1.3,>=1.1->autogluon.tabular[all]==0.8.2->autogluon) (1.16.0)
Requirement already satisfied: datasets>=2.0.0 in /opt/conda/lib/python3.10/site-packages (from evaluate<0.5.0,>=0.4.0->autogluon.multimodal==0.8.2->autogluon) (2.18.0)
Requirement already satisfied: dill in /opt/conda/lib/python3.10/site-packages (from evaluate<0.5.0,>=0.4.0->autogluon.multimodal==0.8.2->autogluon) (0.3.8)
Requirement already satisfied: xxhash in /opt/conda/lib/python3.10/site-packages (from evaluate<0.5.0,>=0.4.0->autogluon.multimodal==0.8.2->autogluon) (3.4.1)
Requirement already satisfied: multiprocess in /opt/conda/lib/python3.10/site-packages (from evaluate<0.5.0,>=0.4.0->autogluon.multimodal==0.8.2->autogluon) (0.70.16)
Requirement already satisfied: fsspec>=2021.05.0 in /opt/conda/lib/python3.10/site-packages (from fsspec[http]>=2021.05.0->evaluate<0.5.0,>=0.4.0->autogluon.multimodal==0.8.2->autogluon) (2023.6.0)
Requirement already satisfied: huggingface-hub>=0.7.0 in /opt/conda/lib/python3.10/site-packages (from evaluate<0.5.0,>=0.4.0->autogluon.multimodal==0.8.2->autogluon) (0.22.2)
Requirement already satisfied: responses<0.19 in /opt/conda/lib/python3.10/site-packages (from evaluate<0.5.0,>=0.4.0->autogluon.multimodal==0.8.2->autogluon) (0.18.0)
Requirement already satisfied: pip in /opt/conda/lib/python3.10/site-packages (from fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (24.0)
Requirement already satisfied: fastdownload<2,>=0.0.5 in /opt/conda/lib/python3.10/site-packages (from fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (0.0.7)
Requirement already satisfied: fastcore<1.6,>=1.5.29 in /opt/conda/lib/python3.10/site-packages (from fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (1.5.29)
Requirement already satisfied: fastprogress>=0.2.4 in /opt/conda/lib/python3.10/site-packages (from fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (1.0.3)
Requirement already satisfied: spacy<4 in /opt/conda/lib/python3.10/site-packages (from fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (3.7.3)
Requirement already satisfied: toolz~=0.10 in /opt/conda/lib/python3.10/site-packages (from gluonts<0.14,>=0.13.1->autogluon.timeseries==0.8.2->autogluon.timeseries[all]==0.8.2->autogluon) (0.12.1)
Requirement already satisfied: typing-extensions~=4.0 in /opt/conda/lib/python3.10/site-packages (from gluonts<0.14,>=0.13.1->autogluon.timeseries==0.8.2->autogluon.timeseries[all]==0.8.2->autogluon) (4.5.0)
Requirement already satisfied: future in /opt/conda/lib/python3.10/site-packages (from hyperopt<0.2.8,>=0.2.7->autogluon.core[all]==0.8.2->autogluon) (1.0.0)
Requirement already satisfied: cloudpickle in /opt/conda/lib/python3.10/site-packages (from hyperopt<0.2.8,>=0.2.7->autogluon.core[all]==0.8.2->autogluon) (2.2.1)
Collecting py4j (from hyperopt<0.2.8,>=0.2.7->autogluon.core[all]==0.8.2->autogluon)
  Downloading py4j-0.10.9.7-py2.py3-none-any.whl.metadata (1.5 kB)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.10/site-packages (from jinja2<3.2,>=3.0.3->autogluon.multimodal==0.8.2->autogluon) (2.1.5)
Requirement already satisfied: attrs>=17.4.0 in /opt/conda/lib/python3.10/site-packages (from jsonschema<4.18,>=4.14->autogluon.multimodal==0.8.2->autogluon) (23.2.0)
Requirement already satisfied: pyrsistent!=0.17.0,!=0.17.1,!=0.17.2,>=0.14.0 in /opt/conda/lib/python3.10/site-packages (from jsonschema<4.18,>=4.14->autogluon.multimodal==0.8.2->autogluon) (0.20.0)
Requirement already satisfied: wheel in /opt/conda/lib/python3.10/site-packages (from lightgbm<3.4,>=3.3->autogluon.tabular[all]==0.8.2->autogluon) (0.43.0)
Requirement already satisfied: numba in /opt/conda/lib/python3.10/site-packages (from mlforecast<0.7.4,>=0.7.0->autogluon.timeseries==0.8.2->autogluon.timeseries[all]==0.8.2->autogluon) (0.59.1)
Requirement already satisfied: window-ops in /opt/conda/lib/python3.10/site-packages (from mlforecast<0.7.4,>=0.7.0->autogluon.timeseries==0.8.2->autogluon.timeseries[all]==0.8.2->autogluon) (0.0.15)
Requirement already satisfied: gdown>=4.0.0 in /opt/conda/lib/python3.10/site-packages (from nlpaug<1.2.0,>=1.1.10->autogluon.multimodal==0.8.2->autogluon) (5.1.0)
Requirement already satisfied: click in /opt/conda/lib/python3.10/site-packages (from nltk<4.0.0,>=3.4.5->autogluon.multimodal==0.8.2->autogluon) (8.1.7)
Requirement already satisfied: regex>=2021.8.3 in /opt/conda/lib/python3.10/site-packages (from nltk<4.0.0,>=3.4.5->autogluon.multimodal==0.8.2->autogluon) (2023.12.25)
Requirement already satisfied: antlr4-python3-runtime==4.9.* in /opt/conda/lib/python3.10/site-packages (from omegaconf<2.3.0,>=2.1.1->autogluon.multimodal==0.8.2->autogluon) (4.9.3)
Requirement already satisfied: colorama in /opt/conda/lib/python3.10/site-packages (from openmim<0.4.0,>=0.3.7->autogluon.multimodal==0.8.2->autogluon) (0.4.6)
Requirement already satisfied: model-index in /opt/conda/lib/python3.10/site-packages (from openmim<0.4.0,>=0.3.7->autogluon.multimodal==0.8.2->autogluon) (0.1.11)
Requirement already satisfied: rich in /opt/conda/lib/python3.10/site-packages (from openmim<0.4.0,>=0.3.7->autogluon.multimodal==0.8.2->autogluon) (13.7.1)
Requirement already satisfied: tabulate in /opt/conda/lib/python3.10/site-packages (from openmim<0.4.0,>=0.3.7->autogluon.multimodal==0.8.2->autogluon) (0.9.0)
Requirement already satisfied: python-dateutil>=2.8.2 in /opt/conda/lib/python3.10/site-packages (from pandas<2.2.0,>=2.0.0->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (2.9.0)
Requirement already satisfied: pytz>=2020.1 in /opt/conda/lib/python3.10/site-packages (from pandas<2.2.0,>=2.0.0->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (2023.3)
Requirement already satisfied: tzdata>=2022.1 in /opt/conda/lib/python3.10/site-packages (from pandas<2.2.0,>=2.0.0->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (2024.1)
Requirement already satisfied: lightning-utilities>=0.7.0 in /opt/conda/lib/python3.10/site-packages (from pytorch-lightning<2.1,>=2.0.0->autogluon.multimodal==0.8.2->autogluon) (0.11.2)
Requirement already satisfied: filelock in /opt/conda/lib/python3.10/site-packages (from ray<2.7,>=2.6.3->ray[tune]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon) (3.13.4)
Requirement already satisfied: msgpack<2.0.0,>=1.0.0 in /opt/conda/lib/python3.10/site-packages (from ray<2.7,>=2.6.3->ray[tune]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon) (1.0.7)
Requirement already satisfied: protobuf!=3.19.5,>=3.15.3 in /opt/conda/lib/python3.10/site-packages (from ray<2.7,>=2.6.3->ray[tune]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon) (4.21.12)
Requirement already satisfied: aiosignal in /opt/conda/lib/python3.10/site-packages (from ray<2.7,>=2.6.3->ray[tune]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon) (1.3.1)
Requirement already satisfied: frozenlist in /opt/conda/lib/python3.10/site-packages (from ray<2.7,>=2.6.3->ray[tune]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon) (1.4.1)
Requirement already satisfied: grpcio>=1.42.0 in /opt/conda/lib/python3.10/site-packages (from ray<2.7,>=2.6.3->ray[tune]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon) (1.54.3)
Requirement already satisfied: aiohttp>=3.7 in /opt/conda/lib/python3.10/site-packages (from ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon) (3.9.3)
Collecting aiohttp-cors (from ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading aiohttp_cors-0.7.0-py3-none-any.whl.metadata (20 kB)
Collecting colorful (from ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading colorful-0.5.6-py2.py3-none-any.whl.metadata (16 kB)
Collecting py-spy>=0.2.0 (from ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading py_spy-0.3.14-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (16 kB)
Collecting gpustat>=1.0.0 (from ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading gpustat-1.1.1.tar.gz (98 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 98.1/98.1 kB 39.0 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting opencensus (from ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading opencensus-0.11.4-py2.py3-none-any.whl.metadata (12 kB)
Requirement already satisfied: prometheus-client>=0.7.1 in /opt/conda/lib/python3.10/site-packages (from ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon) (0.20.0)
Requirement already satisfied: smart-open in /opt/conda/lib/python3.10/site-packages (from ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon) (5.2.1)
Collecting virtualenv<20.21.1,>=20.0.24 (from ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading virtualenv-20.21.0-py3-none-any.whl.metadata (4.1 kB)
Collecting tensorboardX>=1.9 (from ray[tune]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading tensorboardX-2.6.2.2-py2.py3-none-any.whl.metadata (5.8 kB)
Requirement already satisfied: pyarrow>=6.0.1 in /opt/conda/lib/python3.10/site-packages (from ray[tune]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon) (12.0.1)
Requirement already satisfied: charset-normalizer<4,>=2 in /opt/conda/lib/python3.10/site-packages (from requests->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /opt/conda/lib/python3.10/site-packages (from requests->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in /opt/conda/lib/python3.10/site-packages (from requests->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (1.26.18)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.10/site-packages (from requests->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (2024.2.2)
Requirement already satisfied: imageio>=2.4.1 in /opt/conda/lib/python3.10/site-packages (from scikit-image<0.20.0,>=0.19.1->autogluon.multimodal==0.8.2->autogluon) (2.34.0)
Requirement already satisfied: tifffile>=2019.7.26 in /opt/conda/lib/python3.10/site-packages (from scikit-image<0.20.0,>=0.19.1->autogluon.multimodal==0.8.2->autogluon) (2024.2.12)
Requirement already satisfied: PyWavelets>=1.1.1 in /opt/conda/lib/python3.10/site-packages (from scikit-image<0.20.0,>=0.19.1->autogluon.multimodal==0.8.2->autogluon) (1.4.1)
Requirement already satisfied: threadpoolctl>=2.0.0 in /opt/conda/lib/python3.10/site-packages (from scikit-learn<1.5,>=1.3.0->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (3.4.0)
Requirement already satisfied: patsy>=0.5.4 in /opt/conda/lib/python3.10/site-packages (from statsmodels<0.15,>=0.13.0->autogluon.timeseries==0.8.2->autogluon.timeseries[all]==0.8.2->autogluon) (0.5.6)
Requirement already satisfied: absl-py>=0.4 in /opt/conda/lib/python3.10/site-packages (from tensorboard<3,>=2.9->autogluon.multimodal==0.8.2->autogluon) (2.1.0)
Requirement already satisfied: google-auth<3,>=1.6.3 in /opt/conda/lib/python3.10/site-packages (from tensorboard<3,>=2.9->autogluon.multimodal==0.8.2->autogluon) (2.29.0)
Requirement already satisfied: google-auth-oauthlib<1.1,>=0.5 in /opt/conda/lib/python3.10/site-packages (from tensorboard<3,>=2.9->autogluon.multimodal==0.8.2->autogluon) (1.0.0)
Requirement already satisfied: markdown>=2.6.8 in /opt/conda/lib/python3.10/site-packages (from tensorboard<3,>=2.9->autogluon.multimodal==0.8.2->autogluon) (3.6)
Requirement already satisfied: tensorboard-data-server<0.8.0,>=0.7.0 in /opt/conda/lib/python3.10/site-packages (from tensorboard<3,>=2.9->autogluon.multimodal==0.8.2->autogluon) (0.7.0)
Requirement already satisfied: werkzeug>=1.0.1 in /opt/conda/lib/python3.10/site-packages (from tensorboard<3,>=2.9->autogluon.multimodal==0.8.2->autogluon) (3.0.2)
Requirement already satisfied: safetensors in /opt/conda/lib/python3.10/site-packages (from timm<0.10.0,>=0.9.5->autogluon.multimodal==0.8.2->autogluon) (0.4.2)
Requirement already satisfied: sympy in /opt/conda/lib/python3.10/site-packages (from torch<2.1,>=1.13->autogluon.multimodal==0.8.2->autogluon) (1.12)
Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /opt/conda/lib/python3.10/site-packages (from transformers<4.32.0,>=4.31.0->transformers[sentencepiece]<4.32.0,>=4.31.0->autogluon.multimodal==0.8.2->autogluon) (0.13.3)
Requirement already satisfied: sentencepiece!=0.1.92,>=0.1.91 in /opt/conda/lib/python3.10/site-packages (from transformers[sentencepiece]<4.32.0,>=4.31.0->autogluon.multimodal==0.8.2->autogluon) (0.1.99)
Requirement already satisfied: contourpy>=1.0.1 in /opt/conda/lib/python3.10/site-packages (from matplotlib->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (1.2.1)
Requirement already satisfied: cycler>=0.10 in /opt/conda/lib/python3.10/site-packages (from matplotlib->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /opt/conda/lib/python3.10/site-packages (from matplotlib->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (4.51.0)
Requirement already satisfied: kiwisolver>=1.3.1 in /opt/conda/lib/python3.10/site-packages (from matplotlib->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (1.4.5)
Requirement already satisfied: pyparsing>=2.3.1 in /opt/conda/lib/python3.10/site-packages (from matplotlib->autogluon.core==0.8.2->autogluon.core[all]==0.8.2->autogluon) (3.1.2)
Requirement already satisfied: multidict<7.0,>=4.5 in /opt/conda/lib/python3.10/site-packages (from aiohttp>=3.7->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon) (6.0.5)
Requirement already satisfied: yarl<2.0,>=1.0 in /opt/conda/lib/python3.10/site-packages (from aiohttp>=3.7->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon) (1.9.4)
Requirement already satisfied: async-timeout<5.0,>=4.0 in /opt/conda/lib/python3.10/site-packages (from aiohttp>=3.7->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon) (4.0.3)
Requirement already satisfied: pyarrow-hotfix in /opt/conda/lib/python3.10/site-packages (from datasets>=2.0.0->evaluate<0.5.0,>=0.4.0->autogluon.multimodal==0.8.2->autogluon) (0.6)
Requirement already satisfied: beautifulsoup4 in /opt/conda/lib/python3.10/site-packages (from gdown>=4.0.0->nlpaug<1.2.0,>=1.1.10->autogluon.multimodal==0.8.2->autogluon) (4.12.3)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in /opt/conda/lib/python3.10/site-packages (from google-auth<3,>=1.6.3->tensorboard<3,>=2.9->autogluon.multimodal==0.8.2->autogluon) (5.3.3)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /opt/conda/lib/python3.10/site-packages (from google-auth<3,>=1.6.3->tensorboard<3,>=2.9->autogluon.multimodal==0.8.2->autogluon) (0.3.0)
Requirement already satisfied: rsa<5,>=3.1.4 in /opt/conda/lib/python3.10/site-packages (from google-auth<3,>=1.6.3->tensorboard<3,>=2.9->autogluon.multimodal==0.8.2->autogluon) (4.9)
Requirement already satisfied: requests-oauthlib>=0.7.0 in /opt/conda/lib/python3.10/site-packages (from google-auth-oauthlib<1.1,>=0.5->tensorboard<3,>=2.9->autogluon.multimodal==0.8.2->autogluon) (2.0.0)
Collecting nvidia-ml-py>=11.450.129 (from gpustat>=1.0.0->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading nvidia_ml_py-12.550.52-py3-none-any.whl.metadata (8.6 kB)
Collecting blessed>=1.17.1 (from gpustat>=1.0.0->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading blessed-1.20.0-py2.py3-none-any.whl.metadata (13 kB)
Requirement already satisfied: llvmlite<0.43,>=0.42.0dev0 in /opt/conda/lib/python3.10/site-packages (from numba->mlforecast<0.7.4,>=0.7.0->autogluon.timeseries==0.8.2->autogluon.timeseries[all]==0.8.2->autogluon) (0.42.0)
Requirement already satisfied: spacy-legacy<3.1.0,>=3.0.11 in /opt/conda/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (3.0.12)
Requirement already satisfied: spacy-loggers<2.0.0,>=1.0.0 in /opt/conda/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (1.0.5)
Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /opt/conda/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (1.0.10)
Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /opt/conda/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (2.0.8)
Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /opt/conda/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (3.0.9)
Requirement already satisfied: thinc<8.3.0,>=8.2.2 in /opt/conda/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (8.2.2)
Requirement already satisfied: wasabi<1.2.0,>=0.9.1 in /opt/conda/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (1.1.2)
Requirement already satisfied: srsly<3.0.0,>=2.4.3 in /opt/conda/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (2.4.8)
Requirement already satisfied: catalogue<2.1.0,>=2.0.6 in /opt/conda/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (2.0.10)
Requirement already satisfied: weasel<0.4.0,>=0.1.0 in /opt/conda/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (0.3.4)
Requirement already satisfied: typer<0.10.0,>=0.3.0 in /opt/conda/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (0.9.4)
Requirement already satisfied: langcodes<4.0.0,>=3.2.0 in /opt/conda/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (3.3.0)
Collecting distlib<1,>=0.3.6 (from virtualenv<20.21.1,>=20.0.24->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading distlib-0.3.8-py2.py3-none-any.whl.metadata (5.1 kB)
Collecting platformdirs<4,>=2.4 (from virtualenv<20.21.1,>=20.0.24->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading platformdirs-3.11.0-py3-none-any.whl.metadata (11 kB)
Requirement already satisfied: ordered-set in /opt/conda/lib/python3.10/site-packages (from model-index->openmim<0.4.0,>=0.3.7->autogluon.multimodal==0.8.2->autogluon) (4.1.0)
Collecting opencensus-context>=0.1.3 (from opencensus->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading opencensus_context-0.1.3-py2.py3-none-any.whl.metadata (3.3 kB)
Collecting google-api-core<3.0.0,>=1.0.0 (from opencensus->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading google_api_core-2.18.0-py3-none-any.whl.metadata (2.7 kB)
Requirement already satisfied: tenacity>=6.2.0 in /opt/conda/lib/python3.10/site-packages (from plotly->catboost<1.3,>=1.1->autogluon.tabular[all]==0.8.2->autogluon) (8.2.3)
Requirement already satisfied: markdown-it-py>=2.2.0 in /opt/conda/lib/python3.10/site-packages (from rich->openmim<0.4.0,>=0.3.7->autogluon.multimodal==0.8.2->autogluon) (3.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /opt/conda/lib/python3.10/site-packages (from rich->openmim<0.4.0,>=0.3.7->autogluon.multimodal==0.8.2->autogluon) (2.17.2)
Requirement already satisfied: mpmath>=0.19 in /opt/conda/lib/python3.10/site-packages (from sympy->torch<2.1,>=1.13->autogluon.multimodal==0.8.2->autogluon) (1.3.0)
Requirement already satisfied: wcwidth>=0.1.4 in /opt/conda/lib/python3.10/site-packages (from blessed>=1.17.1->gpustat>=1.0.0->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon) (0.2.13)
Collecting googleapis-common-protos<2.0.dev0,>=1.56.2 (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading googleapis_common_protos-1.63.0-py2.py3-none-any.whl.metadata (1.5 kB)
Collecting proto-plus<2.0.0dev,>=1.22.3 (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
  Downloading proto_plus-1.23.0-py3-none-any.whl.metadata (2.2 kB)
Requirement already satisfied: mdurl~=0.1 in /opt/conda/lib/python3.10/site-packages (from markdown-it-py>=2.2.0->rich->openmim<0.4.0,>=0.3.7->autogluon.multimodal==0.8.2->autogluon) (0.1.2)
Requirement already satisfied: pyasn1<0.6.0,>=0.4.6 in /opt/conda/lib/python3.10/site-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard<3,>=2.9->autogluon.multimodal==0.8.2->autogluon) (0.5.1)
Requirement already satisfied: oauthlib>=3.0.0 in /opt/conda/lib/python3.10/site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<1.1,>=0.5->tensorboard<3,>=2.9->autogluon.multimodal==0.8.2->autogluon) (3.2.2)
Requirement already satisfied: blis<0.8.0,>=0.7.8 in /opt/conda/lib/python3.10/site-packages (from thinc<8.3.0,>=8.2.2->spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (0.7.10)
Requirement already satisfied: confection<1.0.0,>=0.0.1 in /opt/conda/lib/python3.10/site-packages (from thinc<8.3.0,>=8.2.2->spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (0.1.4)
Requirement already satisfied: cloudpathlib<0.17.0,>=0.7.0 in /opt/conda/lib/python3.10/site-packages (from weasel<0.4.0,>=0.1.0->spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (0.16.0)
Requirement already satisfied: soupsieve>1.2 in /opt/conda/lib/python3.10/site-packages (from beautifulsoup4->gdown>=4.0.0->nlpaug<1.2.0,>=1.1.10->autogluon.multimodal==0.8.2->autogluon) (2.5)
Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in /opt/conda/lib/python3.10/site-packages (from requests[socks]->gdown>=4.0.0->nlpaug<1.2.0,>=1.1.10->autogluon.multimodal==0.8.2->autogluon) (1.7.1)
Downloading hyperopt-0.2.7-py2.py3-none-any.whl (1.6 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 208.7 MB/s eta 0:00:00
Downloading ray-2.6.3-cp310-cp310-manylinux2014_x86_64.whl (56.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.9/56.9 MB 134.9 MB/s eta 0:00:00a 0:00:01
Downloading py_spy-0.3.14-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (3.0 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.0/3.0 MB 265.5 MB/s eta 0:00:00
Downloading tensorboardX-2.6.2.2-py2.py3-none-any.whl (101 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 101.7/101.7 kB 275.0 MB/s eta 0:00:00
Downloading virtualenv-20.21.0-py3-none-any.whl (8.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.7/8.7 MB 242.7 MB/s eta 0:00:0000:01
Downloading aiohttp_cors-0.7.0-py3-none-any.whl (27 kB)
Downloading colorful-0.5.6-py2.py3-none-any.whl (201 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 201.4/201.4 kB 366.4 MB/s eta 0:00:00
Downloading opencensus-0.11.4-py2.py3-none-any.whl (128 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 128.2/128.2 kB 269.9 MB/s eta 0:00:00
Downloading py4j-0.10.9.7-py2.py3-none-any.whl (200 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 200.5/200.5 kB 377.0 MB/s eta 0:00:00
Downloading blessed-1.20.0-py2.py3-none-any.whl (58 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.4/58.4 kB 247.0 MB/s eta 0:00:00
Downloading distlib-0.3.8-py2.py3-none-any.whl (468 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 468.9/468.9 kB 343.2 MB/s eta 0:00:00
Downloading google_api_core-2.18.0-py3-none-any.whl (138 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 138.3/138.3 kB 290.7 MB/s eta 0:00:00
Downloading nvidia_ml_py-12.550.52-py3-none-any.whl (39 kB)
Downloading opencensus_context-0.1.3-py2.py3-none-any.whl (5.1 kB)
Downloading platformdirs-3.11.0-py3-none-any.whl (17 kB)
Downloading googleapis_common_protos-1.63.0-py2.py3-none-any.whl (229 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 229.1/229.1 kB 305.2 MB/s eta 0:00:00
Downloading proto_plus-1.23.0-py3-none-any.whl (48 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 48.8/48.8 kB 265.0 MB/s eta 0:00:00
Building wheels for collected packages: gpustat
  Building wheel for gpustat (pyproject.toml) ... done
  Created wheel for gpustat: filename=gpustat-1.1.1-py3-none-any.whl size=26532 sha256=aa4b8348de44c46abea7798d3ae46f8adb523c049aa19094e762fbcfe14ab2fc
  Stored in directory: /tmp/pip-ephem-wheel-cache-sgjyrp8c/wheels/ec/d7/80/a71ba3540900e1f276bcae685efd8e590c810d2108b95f1e47
Successfully built gpustat
Installing collected packages: py4j, py-spy, opencensus-context, nvidia-ml-py, distlib, colorful, tensorboardX, proto-plus, platformdirs, googleapis-common-protos, blessed, virtualenv, ray, hyperopt, gpustat, google-api-core, aiohttp-cors, opencensus
  Attempting uninstall: platformdirs
    Found existing installation: platformdirs 4.2.0
    Uninstalling platformdirs-4.2.0:
      Successfully uninstalled platformdirs-4.2.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
sparkmagic 0.21.0 requires pandas<2.0.0,>=0.17.1, but you have pandas 2.1.4 which is incompatible.
Successfully installed aiohttp-cors-0.7.0 blessed-1.20.0 colorful-0.5.6 distlib-0.3.8 google-api-core-2.18.0 googleapis-common-protos-1.63.0 gpustat-1.1.1 hyperopt-0.2.7 nvidia-ml-py-12.550.52 opencensus-0.11.4 opencensus-context-0.1.3 platformdirs-3.11.0 proto-plus-1.23.0 py-spy-0.3.14 py4j-0.10.9.7 ray-2.6.3 tensorboardX-2.6.2.2 virtualenv-20.21.0
Requirement already satisfied: gdown>=4.0.0 in /home/zafar/.local/lib/python3.10/site-packages (from nlpaug<1.2.0,>=1.1.10->autogluon.multimodal==1.1.0->autogluon) (5.1.0)
Requirement already satisfied: click in /usr/lib/python3/dist-packages (from nltk<4.0.0,>=3.4.5->autogluon.multimodal==1.1.0->autogluon) (8.0.3)
Requirement already satisfied: regex>=2021.8.3 in /usr/lib/python3/dist-packages (from nltk<4.0.0,>=3.4.5->autogluon.multimodal==1.1.0->autogluon) (2021.11.10)
Requirement already satisfied: antlr4-python3-runtime==4.9.* in /home/zafar/.local/lib/python3.10/site-packages (from omegaconf<2.3.0,>=2.1.1->autogluon.multimodal==1.1.0->autogluon) (4.9.3)
Requirement already satisfied: colorama in /usr/lib/python3/dist-packages (from openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (0.4.4)
Requirement already satisfied: model-index in /home/zafar/.local/lib/python3.10/site-packages (from openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (0.1.11)
Requirement already satisfied: opendatalab in /home/zafar/.local/lib/python3.10/site-packages (from openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (0.0.10)
Requirement already satisfied: rich in /home/zafar/.local/lib/python3.10/site-packages (from openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (13.4.2)
Requirement already satisfied: tabulate in /home/zafar/.local/lib/python3.10/site-packages (from openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (0.9.0)
Requirement already satisfied: coloredlogs in /home/zafar/.local/lib/python3.10/site-packages (from optimum<1.19,>=1.17->optimum[onnxruntime]<1.19,>=1.17; extra == "all"->autogluon.timeseries[all]==1.1.0->autogluon) (15.0.1)
Requirement already satisfied: sympy in /usr/lib/python3/dist-packages (from optimum<1.19,>=1.17->optimum[onnxruntime]<1.19,>=1.17; extra == "all"->autogluon.timeseries[all]==1.1.0->autogluon) (1.9)
Requirement already satisfied: onnx in /home/zafar/.local/lib/python3.10/site-packages (from optimum[onnxruntime]<1.19,>=1.17; extra == "all"->autogluon.timeseries[all]==1.1.0->autogluon) (1.16.0)
Requirement already satisfied: onnxruntime>=1.11.0 in /home/zafar/.local/lib/python3.10/site-packages (from optimum[onnxruntime]<1.19,>=1.17; extra == "all"->autogluon.timeseries[all]==1.1.0->autogluon) (1.17.3)
Requirement already satisfied: protobuf>=3.20.1 in /home/zafar/.local/lib/python3.10/site-packages (from optimum[onnxruntime]<1.19,>=1.17; extra == "all"->autogluon.timeseries[all]==1.1.0->autogluon) (4.25.3)
Requirement already satisfied: python-dateutil>=2.8.2 in /home/zafar/.local/lib/python3.10/site-packages (from pandas<2.3.0,>=2.0.0->autogluon.core==1.1.0->autogluon.core[all]==1.1.0->autogluon) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in /home/zafar/.local/lib/python3.10/site-packages (from pandas<2.3.0,>=2.0.0->autogluon.core==1.1.0->autogluon.core[all]==1.1.0->autogluon) (2023.4)
Requirement already satisfied: tzdata>=2022.7 in /home/zafar/.local/lib/python3.10/site-packages (from pandas<2.3.0,>=2.0.0->autogluon.core==1.1.0->autogluon.core[all]==1.1.0->autogluon) (2024.1)
Requirement already satisfied: filelock in /home/zafar/.local/lib/python3.10/site-packages (from ray<2.11,>=2.10.0->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (3.13.4)
Requirement already satisfied: msgpack<2.0.0,>=1.0.0 in /usr/lib/python3/dist-packages (from ray<2.11,>=2.10.0->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (1.0.3)
Requirement already satisfied: aiosignal in /home/zafar/.local/lib/python3.10/site-packages (from ray<2.11,>=2.10.0->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (1.3.1)
Requirement already satisfied: frozenlist in /home/zafar/.local/lib/python3.10/site-packages (from ray<2.11,>=2.10.0->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (1.4.1)
Requirement already satisfied: aiohttp>=3.7 in /home/zafar/.local/lib/python3.10/site-packages (from ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (3.9.5)
Requirement already satisfied: aiohttp-cors in /home/zafar/.local/lib/python3.10/site-packages (from ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (0.7.0)
Requirement already satisfied: colorful in /home/zafar/.local/lib/python3.10/site-packages (from ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (0.5.6)
Requirement already satisfied: py-spy>=0.2.0 in /home/zafar/.local/lib/python3.10/site-packages (from ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (0.3.14)
Requirement already satisfied: opencensus in /home/zafar/.local/lib/python3.10/site-packages (from ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (0.11.4)
Requirement already satisfied: prometheus-client>=0.7.1 in /usr/lib/python3/dist-packages (from ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (0.9.0)
Requirement already satisfied: smart-open in /home/zafar/.local/lib/python3.10/site-packages (from ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (6.4.0)
Requirement already satisfied: virtualenv!=20.21.1,>=20.0.24 in /home/zafar/.local/lib/python3.10/site-packages (from ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (20.26.0)
Requirement already satisfied: grpcio>=1.42.0 in /home/zafar/.local/lib/python3.10/site-packages (from ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (1.62.2)
Requirement already satisfied: tensorboardX>=1.9 in /home/zafar/.local/lib/python3.10/site-packages (from ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (2.6.2.2)
Requirement already satisfied: pyarrow>=6.0.1 in /home/zafar/.local/lib/python3.10/site-packages (from ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (16.0.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /home/zafar/.local/lib/python3.10/site-packages (from requests->autogluon.core==1.1.0->autogluon.core[all]==1.1.0->autogluon) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3/dist-packages (from requests->autogluon.core==1.1.0->autogluon.core[all]==1.1.0->autogluon) (3.3)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/lib/python3/dist-packages (from requests->autogluon.core==1.1.0->autogluon.core[all]==1.1.0->autogluon) (1.26.5)
Requirement already satisfied: certifi>=2017.4.17 in /home/zafar/.local/lib/python3.10/site-packages (from requests->autogluon.core==1.1.0->autogluon.core[all]==1.1.0->autogluon) (2024.2.2)
Requirement already satisfied: imageio>=2.4.1 in /home/zafar/.local/lib/python3.10/site-packages (from scikit-image<0.21.0,>=0.19.1->autogluon.multimodal==1.1.0->autogluon) (2.34.1)
Requirement already satisfied: tifffile>=2019.7.26 in /home/zafar/.local/lib/python3.10/site-packages (from scikit-image<0.21.0,>=0.19.1->autogluon.multimodal==1.1.0->autogluon) (2024.4.24)
Requirement already satisfied: PyWavelets>=1.1.1 in /home/zafar/.local/lib/python3.10/site-packages (from scikit-image<0.21.0,>=0.19.1->autogluon.multimodal==1.1.0->autogluon) (1.6.0)
Requirement already satisfied: lazy_loader>=0.1 in /home/zafar/.local/lib/python3.10/site-packages (from scikit-image<0.21.0,>=0.19.1->autogluon.multimodal==1.1.0->autogluon) (0.4)
Requirement already satisfied: threadpoolctl>=2.0.0 in /home/zafar/.local/lib/python3.10/site-packages (from scikit-learn<1.4.1,>=1.3.0->autogluon.core==1.1.0->autogluon.core[all]==1.1.0->autogluon) (3.4.0)
Requirement already satisfied: statsmodels>=0.13.2 in /home/zafar/.local/lib/python3.10/site-packages (from statsforecast<1.5,>=1.4.0->autogluon.timeseries==1.1.0->autogluon.timeseries[all]==1.1.0->autogluon) (0.14.2)
Requirement already satisfied: absl-py>=0.4 in /home/zafar/.local/lib/python3.10/site-packages (from tensorboard<3,>=2.9->autogluon.multimodal==1.1.0->autogluon) (2.1.0)
Requirement already satisfied: markdown>=2.6.8 in /usr/lib/python3/dist-packages (from tensorboard<3,>=2.9->autogluon.multimodal==1.1.0->autogluon) (3.3.6)
Requirement already satisfied: tensorboard-data-server<0.8.0,>=0.7.0 in /home/zafar/.local/lib/python3.10/site-packages (from tensorboard<3,>=2.9->autogluon.multimodal==1.1.0->autogluon) (0.7.2)
Requirement already satisfied: werkzeug>=1.0.1 in /home/zafar/.local/lib/python3.10/site-packages (from tensorboard<3,>=2.9->autogluon.multimodal==1.1.0->autogluon) (3.0.2)
Requirement already satisfied: safetensors in /home/zafar/.local/lib/python3.10/site-packages (from timm<0.10.0,>=0.9.5->autogluon.multimodal==1.1.0->autogluon) (0.4.3)
Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in /home/zafar/.local/lib/python3.10/site-packages (from torch<2.2,>=2.1->autogluon.multimodal==1.1.0->autogluon) (12.1.105)
Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in /home/zafar/.local/lib/python3.10/site-packages (from torch<2.2,>=2.1->autogluon.multimodal==1.1.0->autogluon) (12.1.105)
Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in /home/zafar/.local/lib/python3.10/site-packages (from torch<2.2,>=2.1->autogluon.multimodal==1.1.0->autogluon) (12.1.105)
Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in /home/zafar/.local/lib/python3.10/site-packages (from torch<2.2,>=2.1->autogluon.multimodal==1.1.0->autogluon) (8.9.2.26)
Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in /home/zafar/.local/lib/python3.10/site-packages (from torch<2.2,>=2.1->autogluon.multimodal==1.1.0->autogluon) (12.1.3.1)
Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in /home/zafar/.local/lib/python3.10/site-packages (from torch<2.2,>=2.1->autogluon.multimodal==1.1.0->autogluon) (11.0.2.54)
Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in /home/zafar/.local/lib/python3.10/site-packages (from torch<2.2,>=2.1->autogluon.multimodal==1.1.0->autogluon) (10.3.2.106)
Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in /home/zafar/.local/lib/python3.10/site-packages (from torch<2.2,>=2.1->autogluon.multimodal==1.1.0->autogluon) (11.4.5.107)
Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in /home/zafar/.local/lib/python3.10/site-packages (from torch<2.2,>=2.1->autogluon.multimodal==1.1.0->autogluon) (12.1.0.106)
Requirement already satisfied: nvidia-nccl-cu12==2.18.1 in /home/zafar/.local/lib/python3.10/site-packages (from torch<2.2,>=2.1->autogluon.multimodal==1.1.0->autogluon) (2.18.1)
Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in /home/zafar/.local/lib/python3.10/site-packages (from torch<2.2,>=2.1->autogluon.multimodal==1.1.0->autogluon) (12.1.105)
Requirement already satisfied: triton==2.1.0 in /home/zafar/.local/lib/python3.10/site-packages (from torch<2.2,>=2.1->autogluon.multimodal==1.1.0->autogluon) (2.1.0)
Requirement already satisfied: nvidia-nvjitlink-cu12 in /home/zafar/.local/lib/python3.10/site-packages (from nvidia-cusolver-cu12==11.4.5.107->torch<2.2,>=2.1->autogluon.multimodal==1.1.0->autogluon) (12.4.127)
Requirement already satisfied: tokenizers<0.19,>=0.14 in /home/zafar/.local/lib/python3.10/site-packages (from transformers<4.39.0,>=4.38.0->transformers[sentencepiece]<4.39.0,>=4.38.0->autogluon.multimodal==1.1.0->autogluon) (0.15.2)
Requirement already satisfied: sentencepiece!=0.1.92,>=0.1.91 in /home/zafar/.local/lib/python3.10/site-packages (from transformers[sentencepiece]<4.39.0,>=4.38.0->autogluon.multimodal==1.1.0->autogluon) (0.2.0)
Requirement already satisfied: multidict<7.0,>=4.5 in /home/zafar/.local/lib/python3.10/site-packages (from aiohttp>=3.7->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (6.0.5)
Requirement already satisfied: yarl<2.0,>=1.0 in /home/zafar/.local/lib/python3.10/site-packages (from aiohttp>=3.7->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (1.9.4)
Requirement already satisfied: async-timeout<5.0,>=4.0 in /usr/lib/python3/dist-packages (from aiohttp>=3.7->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (4.0.1)
Requirement already satisfied: pyarrow-hotfix in /home/zafar/.local/lib/python3.10/site-packages (from datasets>=2.0.0->evaluate<0.5.0,>=0.4.0->autogluon.multimodal==1.1.0->autogluon) (0.6)
Requirement already satisfied: beautifulsoup4 in /usr/lib/python3/dist-packages (from gdown>=4.0.0->nlpaug<1.2.0,>=1.1.10->autogluon.multimodal==1.1.0->autogluon) (4.10.0)
Requirement already satisfied: llvmlite<0.43,>=0.42.0dev0 in /home/zafar/.local/lib/python3.10/site-packages (from numba->mlforecast<0.10.1,>=0.10.0->autogluon.timeseries==1.1.0->autogluon.timeseries[all]==1.1.0->autogluon) (0.42.0)
Requirement already satisfied: flatbuffers in /home/zafar/.local/lib/python3.10/site-packages (from onnxruntime>=1.11.0->optimum[onnxruntime]<1.19,>=1.17; extra == "all"->autogluon.timeseries[all]==1.1.0->autogluon) (24.3.25)
Requirement already satisfied: annotated-types>=0.4.0 in /home/zafar/.local/lib/python3.10/site-packages (from pydantic<3,>=1.7->gluonts<0.14.4,>=0.14.0->autogluon.timeseries==1.1.0->autogluon.timeseries[all]==1.1.0->autogluon) (0.6.0)
Requirement already satisfied: pydantic-core==2.18.2 in /home/zafar/.local/lib/python3.10/site-packages (from pydantic<3,>=1.7->gluonts<0.14.4,>=0.14.0->autogluon.timeseries==1.1.0->autogluon.timeseries[all]==1.1.0->autogluon) (2.18.2)
Requirement already satisfied: spacy-legacy<3.1.0,>=3.0.11 in /home/zafar/.local/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (3.0.12)
Requirement already satisfied: spacy-loggers<2.0.0,>=1.0.0 in /home/zafar/.local/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (1.0.5)
Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /home/zafar/.local/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (1.0.10)
Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /home/zafar/.local/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (2.0.8)
Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /home/zafar/.local/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (3.0.9)
Requirement already satisfied: thinc<8.3.0,>=8.2.2 in /home/zafar/.local/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (8.2.3)
Requirement already satisfied: wasabi<1.2.0,>=0.9.1 in /home/zafar/.local/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (1.1.2)
Requirement already satisfied: srsly<3.0.0,>=2.4.3 in /home/zafar/.local/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (2.4.8)
Requirement already satisfied: catalogue<2.1.0,>=2.0.6 in /home/zafar/.local/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (2.0.10)
Requirement already satisfied: weasel<0.4.0,>=0.1.0 in /home/zafar/.local/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (0.3.4)
Requirement already satisfied: typer<0.10.0,>=0.3.0 in /home/zafar/.local/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (0.9.4)
Requirement already satisfied: langcodes<4.0.0,>=3.2.0 in /home/zafar/.local/lib/python3.10/site-packages (from spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (3.4.0)
Requirement already satisfied: patsy>=0.5.6 in /home/zafar/.local/lib/python3.10/site-packages (from statsmodels>=0.13.2->statsforecast<1.5,>=1.4.0->autogluon.timeseries==1.1.0->autogluon.timeseries[all]==1.1.0->autogluon) (0.5.6)
Requirement already satisfied: distlib<1,>=0.3.7 in /home/zafar/.local/lib/python3.10/site-packages (from virtualenv!=20.21.1,>=20.0.24->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (0.3.8)
Requirement already satisfied: platformdirs<5,>=3.9.1 in /home/zafar/.local/lib/python3.10/site-packages (from virtualenv!=20.21.1,>=20.0.24->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (4.2.1)
Requirement already satisfied: MarkupSafe>=2.1.1 in /home/zafar/.local/lib/python3.10/site-packages (from werkzeug>=1.0.1->tensorboard<3,>=2.9->autogluon.multimodal==1.1.0->autogluon) (2.1.5)
Requirement already satisfied: humanfriendly>=9.1 in /home/zafar/.local/lib/python3.10/site-packages (from coloredlogs->optimum<1.19,>=1.17->optimum[onnxruntime]<1.19,>=1.17; extra == "all"->autogluon.timeseries[all]==1.1.0->autogluon) (10.0)
Requirement already satisfied: ordered-set in /home/zafar/.local/lib/python3.10/site-packages (from model-index->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (4.1.0)
Requirement already satisfied: opencensus-context>=0.1.3 in /home/zafar/.local/lib/python3.10/site-packages (from opencensus->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (0.1.3)
Requirement already satisfied: google-api-core<3.0.0,>=1.0.0 in /home/zafar/.local/lib/python3.10/site-packages (from opencensus->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (2.18.0)
Requirement already satisfied: pycryptodome in /home/zafar/.local/lib/python3.10/site-packages (from opendatalab->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (3.20.0)
Requirement already satisfied: openxlab in /home/zafar/.local/lib/python3.10/site-packages (from opendatalab->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (0.0.38)
Requirement already satisfied: tenacity>=6.2.0 in /home/zafar/.local/lib/python3.10/site-packages (from plotly->catboost<1.3,>=1.1->autogluon.tabular[all]==1.1.0->autogluon) (8.2.3)
Requirement already satisfied: markdown-it-py>=2.2.0 in /home/zafar/.local/lib/python3.10/site-packages (from rich->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (3.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /home/zafar/.local/lib/python3.10/site-packages (from rich->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (2.17.2)
Requirement already satisfied: googleapis-common-protos<2.0.dev0,>=1.56.2 in /home/zafar/.local/lib/python3.10/site-packages (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (1.63.0)
Requirement already satisfied: proto-plus<2.0.0dev,>=1.22.3 in /home/zafar/.local/lib/python3.10/site-packages (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (1.23.0)
Requirement already satisfied: google-auth<3.0.dev0,>=2.14.1 in /home/zafar/.local/lib/python3.10/site-packages (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (2.29.0)
Requirement already satisfied: language-data>=1.2 in /home/zafar/.local/lib/python3.10/site-packages (from langcodes<4.0.0,>=3.2.0->spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (1.2.0)
Requirement already satisfied: mdurl~=0.1 in /home/zafar/.local/lib/python3.10/site-packages (from markdown-it-py>=2.2.0->rich->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (0.1.2)
Requirement already satisfied: blis<0.8.0,>=0.7.8 in /home/zafar/.local/lib/python3.10/site-packages (from thinc<8.3.0,>=8.2.2->spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (0.7.11)
Requirement already satisfied: confection<1.0.0,>=0.0.1 in /home/zafar/.local/lib/python3.10/site-packages (from thinc<8.3.0,>=8.2.2->spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (0.1.4)
Requirement already satisfied: cloudpathlib<0.17.0,>=0.7.0 in /home/zafar/.local/lib/python3.10/site-packages (from weasel<0.4.0,>=0.1.0->spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (0.16.0)
Requirement already satisfied: oss2~=2.17.0 in /home/zafar/.local/lib/python3.10/site-packages (from openxlab->opendatalab->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (2.17.0)
Collecting setuptools (from autogluon.common==1.1.0->autogluon.core==1.1.0->autogluon.core[all]==1.1.0->autogluon)
  Downloading setuptools-60.2.0-py3-none-any.whl.metadata (5.1 kB)
Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in /home/zafar/.local/lib/python3.10/site-packages (from requests[socks]->gdown>=4.0.0->nlpaug<1.2.0,>=1.1.10->autogluon.multimodal==1.1.0->autogluon) (1.7.1)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in /home/zafar/.local/lib/python3.10/site-packages (from google-auth<3.0.dev0,>=2.14.1->google-api-core<3.0.0,>=1.0.0->opencensus->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (5.3.3)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /home/zafar/.local/lib/python3.10/site-packages (from google-auth<3.0.dev0,>=2.14.1->google-api-core<3.0.0,>=1.0.0->opencensus->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (0.4.0)
Requirement already satisfied: rsa<5,>=3.1.4 in /home/zafar/.local/lib/python3.10/site-packages (from google-auth<3.0.dev0,>=2.14.1->google-api-core<3.0.0,>=1.0.0->opencensus->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (4.9)
Requirement already satisfied: marisa-trie>=0.7.7 in /home/zafar/.local/lib/python3.10/site-packages (from language-data>=1.2->langcodes<4.0.0,>=3.2.0->spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==1.1.0->autogluon) (1.1.0)
Requirement already satisfied: crcmod>=1.7 in /home/zafar/.local/lib/python3.10/site-packages (from oss2~=2.17.0->openxlab->opendatalab->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (1.7)
Requirement already satisfied: aliyun-python-sdk-kms>=2.4.1 in /home/zafar/.local/lib/python3.10/site-packages (from oss2~=2.17.0->openxlab->opendatalab->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (2.16.2)
Requirement already satisfied: aliyun-python-sdk-core>=2.13.12 in /home/zafar/.local/lib/python3.10/site-packages (from oss2~=2.17.0->openxlab->opendatalab->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (2.15.1)
Requirement already satisfied: cryptography>=2.6.0 in /usr/lib/python3/dist-packages (from aliyun-python-sdk-core>=2.13.12->oss2~=2.17.0->openxlab->opendatalab->openmim<0.4.0,>=0.3.7->autogluon.multimodal==1.1.0->autogluon) (3.4.8)
Requirement already satisfied: pyasn1<0.7.0,>=0.4.6 in /home/zafar/.local/lib/python3.10/site-packages (from pyasn1-modules>=0.2.1->google-auth<3.0.dev0,>=2.14.1->google-api-core<3.0.0,>=1.0.0->opencensus->ray[default,tune]<2.11,>=2.10.0; extra == "all"->autogluon.core[all]==1.1.0->autogluon) (0.6.0)
Downloading setuptools-60.2.0-py3-none-any.whl (953 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 953.1/953.1 kB 807.8 kB/s eta 0:00:00 kB/s eta 0:00:01:01
Installing collected packages: setuptools
  Attempting uninstall: setuptools
    Found existing installation: setuptools 69.5.1
    Uninstalling setuptools-69.5.1:
      Successfully uninstalled setuptools-69.5.1
Successfully installed setuptools-60.2.0

Setup Kaggle API Key¶

In [1]:
# create the .kaggle directory and an empty kaggle.json file
!mkdir -p ~/.kaggle
!touch ~/.kaggle/kaggle.json
!chmod 600 ~/.kaggle/kaggle.json
In [2]:
import os
import json

# Get the user's home directory
home_dir = os.path.expanduser("~")

# Fill in your user name and key from creating the Kaggle account and API token file
kaggle_username = "zafarabdugaffarov"
kaggle_key = "2057716522fbae8f5161b48d587f51d6"

# Create the .kaggle directory if it doesn't exist
kaggle_dir = os.path.join(home_dir, ".kaggle")
if not os.path.exists(kaggle_dir):
    os.makedirs(kaggle_dir)

# Save API token to the kaggle.json file
kaggle_json_path = os.path.join(kaggle_dir, "kaggle.json")
with open(kaggle_json_path, "w") as f:
    json.dump({"username": kaggle_username, "key": kaggle_key}, f)

# Set appropriate permissions
os.chmod(kaggle_json_path, 0o600)

Download and explore dataset¶

Go to the bike sharing demand competition and agree to the terms¶

kaggle6.png

In [3]:
# Download the dataset, it will be in a .zip file so you'll need to unzip it as well.
!kaggle competitions download -c bike-sharing-demand
# If you already downloaded it you can use the -o command to overwrite the file
!unzip -o bike-sharing-demand.zip
/bin/bash: line 1: kaggle: command not found
unzip:  cannot find or open bike-sharing-demand.zip, bike-sharing-demand.zip.zip or bike-sharing-demand.zip.ZIP.
In [1]:
import pandas as pd
from autogluon.tabular import TabularPredictor
In [2]:
# Create the train dataset in pandas by reading the csv
# Set the parsing of the datetime column so you can use some of the `dt` features in pandas later
train = pd.read_csv('train.csv')
train.head()
Out[2]:
datetime season holiday workingday weather temp atemp humidity windspeed casual registered count
0 2011-01-01 00:00:00 1 0 0 1 9.84 14.395 81 0.0 3 13 16
1 2011-01-01 01:00:00 1 0 0 1 9.02 13.635 80 0.0 8 32 40
2 2011-01-01 02:00:00 1 0 0 1 9.02 13.635 80 0.0 5 27 32
3 2011-01-01 03:00:00 1 0 0 1 9.84 14.395 75 0.0 3 10 13
4 2011-01-01 04:00:00 1 0 0 1 9.84 14.395 75 0.0 0 1 1
In [3]:
# Simple output of the train dataset to view some of the min/max/varition of the dataset features.
train.describe()
Out[3]:
season holiday workingday weather temp atemp humidity windspeed casual registered count
count 10886.000000 10886.000000 10886.000000 10886.000000 10886.00000 10886.000000 10886.000000 10886.000000 10886.000000 10886.000000 10886.000000
mean 2.506614 0.028569 0.680875 1.418427 20.23086 23.655084 61.886460 12.799395 36.021955 155.552177 191.574132
std 1.116174 0.166599 0.466159 0.633839 7.79159 8.474601 19.245033 8.164537 49.960477 151.039033 181.144454
min 1.000000 0.000000 0.000000 1.000000 0.82000 0.760000 0.000000 0.000000 0.000000 0.000000 1.000000
25% 2.000000 0.000000 0.000000 1.000000 13.94000 16.665000 47.000000 7.001500 4.000000 36.000000 42.000000
50% 3.000000 0.000000 1.000000 1.000000 20.50000 24.240000 62.000000 12.998000 17.000000 118.000000 145.000000
75% 4.000000 0.000000 1.000000 2.000000 26.24000 31.060000 77.000000 16.997900 49.000000 222.000000 284.000000
max 4.000000 1.000000 1.000000 4.000000 41.00000 45.455000 100.000000 56.996900 367.000000 886.000000 977.000000
In [4]:
# Create the test pandas dataframe in pandas by reading the csv, remember to parse the datetime!
test = pd.read_csv('test.csv')
test.head()
Out[4]:
datetime season holiday workingday weather temp atemp humidity windspeed
0 2011-01-20 00:00:00 1 0 1 1 10.66 11.365 56 26.0027
1 2011-01-20 01:00:00 1 0 1 1 10.66 13.635 56 0.0000
2 2011-01-20 02:00:00 1 0 1 1 10.66 13.635 56 0.0000
3 2011-01-20 03:00:00 1 0 1 1 10.66 12.880 56 11.0014
4 2011-01-20 04:00:00 1 0 1 1 10.66 12.880 56 11.0014
In [5]:
test.describe()
Out[5]:
season holiday workingday weather temp atemp humidity windspeed
count 6493.000000 6493.000000 6493.000000 6493.000000 6493.000000 6493.000000 6493.000000 6493.000000
mean 2.493300 0.029108 0.685815 1.436778 20.620607 24.012865 64.125212 12.631157
std 1.091258 0.168123 0.464226 0.648390 8.059583 8.782741 19.293391 8.250151
min 1.000000 0.000000 0.000000 1.000000 0.820000 0.000000 16.000000 0.000000
25% 2.000000 0.000000 0.000000 1.000000 13.940000 16.665000 49.000000 7.001500
50% 3.000000 0.000000 1.000000 1.000000 21.320000 25.000000 65.000000 11.001400
75% 3.000000 0.000000 1.000000 2.000000 27.060000 31.060000 81.000000 16.997900
max 4.000000 1.000000 1.000000 4.000000 40.180000 50.000000 100.000000 55.998600
In [6]:
# Same thing as train and test dataset
submission = pd.read_csv('sampleSubmission.csv')
submission.head()
Out[6]:
datetime count
0 2011-01-20 00:00:00 0
1 2011-01-20 01:00:00 0
2 2011-01-20 02:00:00 0
3 2011-01-20 03:00:00 0
4 2011-01-20 04:00:00 0

Step 3: Train a model using AutoGluon’s Tabular Prediction¶

Requirements:

  • We are prediting count, so it is the label we are setting.
  • Ignore casual and registered columns as they are also not present in the test dataset.
  • Use the root_mean_squared_error as the metric to use for evaluation.
  • Set a time limit of 10 minutes (600 seconds).
  • Use the preset best_quality to focus on creating the best model.
In [10]:
predictor = TabularPredictor(
    label="count", problem_type="regression", eval_metric="rmse"
    ).fit(
    train_data=train.drop(['casual', 'registered'], axis=1),
    time_limit=600,
    presets='best_quality')
No path specified. Models will be saved in: "AutogluonModels/ag-20240429_105322"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20240429_105322"
AutoGluon Version:  0.8.2
Python Version:     3.10.14
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Sat Mar 23 09:49:55 UTC 2024
Disk Space Avail:   4.42 GB / 5.36 GB (82.4%)
	WARNING: Available disk space is low and there is a risk that AutoGluon will run out of disk during fit, causing an exception. 
	We recommend a minimum available disk space of 10 GB, and large datasets may require more.
Train Data Rows:    10886
Train Data Columns: 9
Label Column: count
Preprocessing data ...
/opt/conda/lib/python3.10/site-packages/autogluon/tabular/learner/default_learner.py:215: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
  with pd.option_context("mode.use_inf_as_na", True):  # treat None, NaN, INF, NINF as NA
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    2343.77 MB
	Train Data (Original)  Memory Usage: 1.52 MB (0.1% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
	Stage 1 Generators:
		Fitting AsTypeFeatureGenerator...
			Note: Converting 2 features to boolean dtype as they only contain 2 unique values.
	Stage 2 Generators:
		Fitting FillNaFeatureGenerator...
/opt/conda/lib/python3.10/site-packages/autogluon/features/generators/fillna.py:58: FutureWarning: The 'downcast' keyword in fillna is deprecated and will be removed in a future version. Use res.infer_objects(copy=False) to infer non-object dtype, or pd.to_numeric with the 'downcast' keyword to downcast numeric results.
  X.fillna(self._fillna_feature_map, inplace=True, downcast=False)
	Stage 3 Generators:
		Fitting IdentityFeatureGenerator...
		Fitting DatetimeFeatureGenerator...
	Stage 4 Generators:
		Fitting DropUniqueFeatureGenerator...
	Stage 5 Generators:
		Fitting DropDuplicatesFeatureGenerator...
	Types of features in original data (raw dtype, special dtypes):
		('float', [])                      : 3 | ['temp', 'atemp', 'windspeed']
		('int', [])                        : 5 | ['season', 'holiday', 'workingday', 'weather', 'humidity']
		('object', ['datetime_as_object']) : 1 | ['datetime']
	Types of features in processed data (raw dtype, special dtypes):
		('float', [])                : 3 | ['temp', 'atemp', 'windspeed']
		('int', [])                  : 3 | ['season', 'weather', 'humidity']
		('int', ['bool'])            : 2 | ['holiday', 'workingday']
		('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
	0.1s = Fit runtime
	9 features in original data used to generate 13 features in processed data.
	Train Data (Processed) Memory Usage: 0.98 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.19s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
	This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
	To change this, specify the eval_metric parameter of Predictor()
User-specified model hyperparameters to be fit:
{
	'NN_TORCH': {},
	'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
	'CAT': {},
	'XGB': {},
	'FASTAI': {},
	'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 399.77s of the 599.81s of remaining time.
	-101.5462	 = Validation score   (-root_mean_squared_error)
	0.04s	 = Training   runtime
	0.06s	 = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 396.29s of the 596.32s of remaining time.
	-84.1251	 = Validation score   (-root_mean_squared_error)
	0.04s	 = Training   runtime
	0.1s	 = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 396.11s of the 596.14s of remaining time.
Will use sequential fold fitting strategy because import of ray failed. Reason: ray is required to train folds in parallel for TabularPredictor or HPO for MultiModalPredictor. A quick tip is to install via `pip install ray==2.6.3`
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
/opt/conda/lib/python3.10/site-packages/dask/dataframe/_pyarrow_compat.py:17: FutureWarning: Minimal version of pyarrow will soon be increased to 14.0.1. You are using 12.0.1. Please consider upgrading.
  warnings.warn(
/opt/conda/lib/python3.10/site-packages/dask/dataframe/__init__.py:31: FutureWarning: 
Dask dataframe query planning is disabled because dask-expr is not installed.

You can install it with `pip install dask[dataframe]` or `conda install dask`.
This will raise in a future version.

  warnings.warn(msg, FutureWarning)
[1000]	valid_set's rmse: 131.684
[2000]	valid_set's rmse: 130.67
[3000]	valid_set's rmse: 130.626
[1000]	valid_set's rmse: 135.592
[1000]	valid_set's rmse: 133.481
[2000]	valid_set's rmse: 132.323
[3000]	valid_set's rmse: 131.618
[4000]	valid_set's rmse: 131.443
[5000]	valid_set's rmse: 131.265
[6000]	valid_set's rmse: 131.277
[7000]	valid_set's rmse: 131.443
[1000]	valid_set's rmse: 128.503
[2000]	valid_set's rmse: 127.654
[3000]	valid_set's rmse: 127.227
[4000]	valid_set's rmse: 127.105
[1000]	valid_set's rmse: 134.135
[2000]	valid_set's rmse: 132.272
[3000]	valid_set's rmse: 131.286
[4000]	valid_set's rmse: 130.752
[5000]	valid_set's rmse: 130.363
[6000]	valid_set's rmse: 130.509
[1000]	valid_set's rmse: 136.168
[2000]	valid_set's rmse: 135.138
[3000]	valid_set's rmse: 135.029
[1000]	valid_set's rmse: 134.061
[2000]	valid_set's rmse: 133.034
[3000]	valid_set's rmse: 132.182
[4000]	valid_set's rmse: 131.997
[5000]	valid_set's rmse: 131.643
[6000]	valid_set's rmse: 131.504
[7000]	valid_set's rmse: 131.574
[1000]	valid_set's rmse: 132.912
[2000]	valid_set's rmse: 131.703
[3000]	valid_set's rmse: 131.117
[4000]	valid_set's rmse: 130.82
[5000]	valid_set's rmse: 130.673
[6000]	valid_set's rmse: 130.708
	-131.4609	 = Validation score   (-root_mean_squared_error)
	51.98s	 = Training   runtime
	8.0s	 = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 328.61s of the 528.64s of remaining time.
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
[1000]	valid_set's rmse: 130.818
[1000]	valid_set's rmse: 133.204
[1000]	valid_set's rmse: 130.928
[1000]	valid_set's rmse: 126.846
[1000]	valid_set's rmse: 131.426
[1000]	valid_set's rmse: 133.655
[1000]	valid_set's rmse: 132.155
[1000]	valid_set's rmse: 130.62
	-131.0542	 = Validation score   (-root_mean_squared_error)
	12.82s	 = Training   runtime
	1.39s	 = Validation runtime
Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 312.64s of the 512.68s of remaining time.
	-116.5484	 = Validation score   (-root_mean_squared_error)
	19.04s	 = Training   runtime
	0.82s	 = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 291.92s of the 491.96s of remaining time.
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
	Ran out of time, early stopping on iteration 4050.
	Ran out of time, early stopping on iteration 3990.
	Ran out of time, early stopping on iteration 4363.
	Ran out of time, early stopping on iteration 4403.
	Ran out of time, early stopping on iteration 4811.
	-130.5713	 = Validation score   (-root_mean_squared_error)
	240.14s	 = Training   runtime
	0.11s	 = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L1 ... Training model for up to 51.49s of the 251.52s of remaining time.
	-124.6007	 = Validation score   (-root_mean_squared_error)
	8.29s	 = Training   runtime
	0.51s	 = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 42.04s of the 242.07s of remaining time.
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
	Ran out of time, stopping training early. (Stopping on epoch 6)
	Ran out of time, stopping training early. (Stopping on epoch 7)
	Ran out of time, stopping training early. (Stopping on epoch 9)
	Ran out of time, stopping training early. (Stopping on epoch 11)
	Ran out of time, stopping training early. (Stopping on epoch 12)
	-140.3195	 = Validation score   (-root_mean_squared_error)
	40.23s	 = Training   runtime
	0.54s	 = Validation runtime
Fitting model: XGBoost_BAG_L1 ... Training model for up to 0.95s of the 200.98s of remaining time.
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
	Time limit exceeded... Skipping XGBoost_BAG_L1.
Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 0.65s of the 200.68s of remaining time.
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
	Time limit exceeded... Skipping NeuralNetTorch_BAG_L1.
Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 0.44s of the 200.48s of remaining time.
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
	Ran out of time, early stopping on iteration 1. Best iteration is:
	[1]	valid_set's rmse: 179.334
	Time limit exceeded... Skipping LightGBMLarge_BAG_L1.
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 199.78s of remaining time.
	-84.1251	 = Validation score   (-root_mean_squared_error)
	0.73s	 = Training   runtime
	0.0s	 = Validation runtime
Fitting 9 L2 models ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 198.98s of the 198.96s of remaining time.
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
[1000]	valid_set's rmse: 60.9213
[2000]	valid_set's rmse: 60.0388
[3000]	valid_set's rmse: 59.8521
[1000]	valid_set's rmse: 61.2639
[2000]	valid_set's rmse: 60.3481
[1000]	valid_set's rmse: 64.0419
[2000]	valid_set's rmse: 62.8485
[1000]	valid_set's rmse: 64.4371
[2000]	valid_set's rmse: 62.5034
[3000]	valid_set's rmse: 62.3424
[1000]	valid_set's rmse: 58.7129
[2000]	valid_set's rmse: 57.6587
[1000]	valid_set's rmse: 63.5234
[2000]	valid_set's rmse: 62.3591
[1000]	valid_set's rmse: 62.7864
[2000]	valid_set's rmse: 61.7307
[3000]	valid_set's rmse: 61.6274
[1000]	valid_set's rmse: 57.7822
[2000]	valid_set's rmse: 57.105
	-60.4701	 = Validation score   (-root_mean_squared_error)
	46.59s	 = Training   runtime
	3.89s	 = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 144.17s of the 144.15s of remaining time.
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
	-55.134	 = Validation score   (-root_mean_squared_error)
	12.76s	 = Training   runtime
	0.21s	 = Validation runtime
Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 130.87s of the 130.85s of remaining time.
	-53.4515	 = Validation score   (-root_mean_squared_error)
	42.03s	 = Training   runtime
	0.72s	 = Validation runtime
Fitting model: CatBoost_BAG_L2 ... Training model for up to 87.48s of the 87.46s of remaining time.
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
	Ran out of time, early stopping on iteration 1047.
	Ran out of time, early stopping on iteration 1188.
	Ran out of time, early stopping on iteration 1307.
	Ran out of time, early stopping on iteration 1348.
	Ran out of time, early stopping on iteration 1445.
	-55.4685	 = Validation score   (-root_mean_squared_error)
	78.22s	 = Training   runtime
	0.06s	 = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L2 ... Training model for up to 9.11s of the 9.09s of remaining time.
	-53.8593	 = Validation score   (-root_mean_squared_error)
	15.39s	 = Training   runtime
	0.85s	 = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the -7.67s of remaining time.
	-52.8658	 = Validation score   (-root_mean_squared_error)
	0.29s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 607.99s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240429_105322")

Review AutoGluon's training run with ranking of models that did the best.¶

In [11]:
predictor.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
                     model   score_val  pred_time_val    fit_time  pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  fit_order
0      WeightedEnsemble_L3  -52.865846      13.389465  521.269552                0.000817           0.287105            3       True         15
1   RandomForestMSE_BAG_L2  -53.451478      12.257624  414.610369                0.722348          42.028500            2       True         12
2     ExtraTreesMSE_BAG_L2  -53.859260      12.389017  387.974257                0.853741          15.392388            2       True         14
3          LightGBM_BAG_L2  -55.133954      11.748551  385.337010                0.213275          12.755141            2       True         11
4          CatBoost_BAG_L2  -55.468453      11.599283  450.806419                0.064007          78.224550            2       True         13
5        LightGBMXT_BAG_L2  -60.470103      15.426254  419.170218                3.890978          46.588348            2       True         10
6    KNeighborsDist_BAG_L1  -84.125061       0.104822    0.036808                0.104822           0.036808            1       True          2
7      WeightedEnsemble_L2  -84.125061       0.106286    0.763713                0.001464           0.726904            2       True          9
8    KNeighborsUnif_BAG_L1 -101.546199       0.056908    0.039038                0.056908           0.039038            1       True          1
9   RandomForestMSE_BAG_L1 -116.548359       0.817105   19.042239                0.817105          19.042239            1       True          5
10    ExtraTreesMSE_BAG_L1 -124.600676       0.513394    8.290484                0.513394           8.290484            1       True          7
11         CatBoost_BAG_L1 -130.571286       0.110086  240.143042                0.110086         240.143042            1       True          6
12         LightGBM_BAG_L1 -131.054162       1.394772   12.816886                1.394772          12.816886            1       True          4
13       LightGBMXT_BAG_L1 -131.460909       7.996730   51.984224                7.996730          51.984224            1       True          3
14  NeuralNetFastAI_BAG_L1 -140.319540       0.541459   40.229147                0.541459          40.229147            1       True          8
Number of models trained: 15
Types of models trained:
{'WeightedEnsembleModel', 'StackerEnsembleModel_XT', 'StackerEnsembleModel_CatBoost', 'StackerEnsembleModel_LGB', 'StackerEnsembleModel_NNFastAiTabular', 'StackerEnsembleModel_RF', 'StackerEnsembleModel_KNN'}
Bagging used: True  (with 8 folds)
Multi-layer stack-ensembling used: True  (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('float', [])                : 3 | ['temp', 'atemp', 'windspeed']
('int', [])                  : 3 | ['season', 'weather', 'humidity']
('int', ['bool'])            : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
*** End of fit() summary ***
/opt/conda/lib/python3.10/site-packages/autogluon/core/utils/plots.py:169: UserWarning: AutoGluon summary plots cannot be created because bokeh is not installed. To see plots, please do: "pip install bokeh==2.0.1"
  warnings.warn('AutoGluon summary plots cannot be created because bokeh is not installed. To see plots, please do: "pip install bokeh==2.0.1"')
Out[11]:
{'model_types': {'KNeighborsUnif_BAG_L1': 'StackerEnsembleModel_KNN',
  'KNeighborsDist_BAG_L1': 'StackerEnsembleModel_KNN',
  'LightGBMXT_BAG_L1': 'StackerEnsembleModel_LGB',
  'LightGBM_BAG_L1': 'StackerEnsembleModel_LGB',
  'RandomForestMSE_BAG_L1': 'StackerEnsembleModel_RF',
  'CatBoost_BAG_L1': 'StackerEnsembleModel_CatBoost',
  'ExtraTreesMSE_BAG_L1': 'StackerEnsembleModel_XT',
  'NeuralNetFastAI_BAG_L1': 'StackerEnsembleModel_NNFastAiTabular',
  'WeightedEnsemble_L2': 'WeightedEnsembleModel',
  'LightGBMXT_BAG_L2': 'StackerEnsembleModel_LGB',
  'LightGBM_BAG_L2': 'StackerEnsembleModel_LGB',
  'RandomForestMSE_BAG_L2': 'StackerEnsembleModel_RF',
  'CatBoost_BAG_L2': 'StackerEnsembleModel_CatBoost',
  'ExtraTreesMSE_BAG_L2': 'StackerEnsembleModel_XT',
  'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
 'model_performance': {'KNeighborsUnif_BAG_L1': -101.54619908446061,
  'KNeighborsDist_BAG_L1': -84.12506123181602,
  'LightGBMXT_BAG_L1': -131.46090891834504,
  'LightGBM_BAG_L1': -131.054161598899,
  'RandomForestMSE_BAG_L1': -116.54835939455667,
  'CatBoost_BAG_L1': -130.57128624162138,
  'ExtraTreesMSE_BAG_L1': -124.60067564699747,
  'NeuralNetFastAI_BAG_L1': -140.31953985560796,
  'WeightedEnsemble_L2': -84.12506123181602,
  'LightGBMXT_BAG_L2': -60.47010263935106,
  'LightGBM_BAG_L2': -55.133953757189154,
  'RandomForestMSE_BAG_L2': -53.45147772878295,
  'CatBoost_BAG_L2': -55.46845278345196,
  'ExtraTreesMSE_BAG_L2': -53.8592596631687,
  'WeightedEnsemble_L3': -52.86584564696691},
 'model_best': 'WeightedEnsemble_L3',
 'model_paths': {'KNeighborsUnif_BAG_L1': ['KNeighborsUnif_BAG_L1'],
  'KNeighborsDist_BAG_L1': ['KNeighborsDist_BAG_L1'],
  'LightGBMXT_BAG_L1': ['LightGBMXT_BAG_L1'],
  'LightGBM_BAG_L1': ['LightGBM_BAG_L1'],
  'RandomForestMSE_BAG_L1': ['RandomForestMSE_BAG_L1'],
  'CatBoost_BAG_L1': ['CatBoost_BAG_L1'],
  'ExtraTreesMSE_BAG_L1': ['ExtraTreesMSE_BAG_L1'],
  'NeuralNetFastAI_BAG_L1': ['NeuralNetFastAI_BAG_L1'],
  'WeightedEnsemble_L2': ['WeightedEnsemble_L2'],
  'LightGBMXT_BAG_L2': ['LightGBMXT_BAG_L2'],
  'LightGBM_BAG_L2': ['LightGBM_BAG_L2'],
  'RandomForestMSE_BAG_L2': ['RandomForestMSE_BAG_L2'],
  'CatBoost_BAG_L2': ['CatBoost_BAG_L2'],
  'ExtraTreesMSE_BAG_L2': ['ExtraTreesMSE_BAG_L2'],
  'WeightedEnsemble_L3': ['WeightedEnsemble_L3']},
 'model_fit_times': {'KNeighborsUnif_BAG_L1': 0.03903794288635254,
  'KNeighborsDist_BAG_L1': 0.03680849075317383,
  'LightGBMXT_BAG_L1': 51.98422408103943,
  'LightGBM_BAG_L1': 12.816885948181152,
  'RandomForestMSE_BAG_L1': 19.04223918914795,
  'CatBoost_BAG_L1': 240.14304184913635,
  'ExtraTreesMSE_BAG_L1': 8.290484189987183,
  'NeuralNetFastAI_BAG_L1': 40.22914743423462,
  'WeightedEnsemble_L2': 0.7269041538238525,
  'LightGBMXT_BAG_L2': 46.588348388671875,
  'LightGBM_BAG_L2': 12.755140781402588,
  'RandomForestMSE_BAG_L2': 42.02849984169006,
  'CatBoost_BAG_L2': 78.2245500087738,
  'ExtraTreesMSE_BAG_L2': 15.392388105392456,
  'WeightedEnsemble_L3': 0.28710460662841797},
 'model_pred_times': {'KNeighborsUnif_BAG_L1': 0.056908369064331055,
  'KNeighborsDist_BAG_L1': 0.10482192039489746,
  'LightGBMXT_BAG_L1': 7.996730327606201,
  'LightGBM_BAG_L1': 1.3947715759277344,
  'RandomForestMSE_BAG_L1': 0.8171048164367676,
  'CatBoost_BAG_L1': 0.11008644104003906,
  'ExtraTreesMSE_BAG_L1': 0.5133936405181885,
  'NeuralNetFastAI_BAG_L1': 0.5414590835571289,
  'WeightedEnsemble_L2': 0.0014636516571044922,
  'LightGBMXT_BAG_L2': 3.8909778594970703,
  'LightGBM_BAG_L2': 0.21327519416809082,
  'RandomForestMSE_BAG_L2': 0.7223482131958008,
  'CatBoost_BAG_L2': 0.06400728225708008,
  'ExtraTreesMSE_BAG_L2': 0.853740930557251,
  'WeightedEnsemble_L3': 0.0008168220520019531},
 'num_bag_folds': 8,
 'max_stack_level': 3,
 'model_hyperparams': {'KNeighborsUnif_BAG_L1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True,
   'use_child_oof': True},
  'KNeighborsDist_BAG_L1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True,
   'use_child_oof': True},
  'LightGBMXT_BAG_L1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'LightGBM_BAG_L1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'RandomForestMSE_BAG_L1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True,
   'use_child_oof': True},
  'CatBoost_BAG_L1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'ExtraTreesMSE_BAG_L1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True,
   'use_child_oof': True},
  'NeuralNetFastAI_BAG_L1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'WeightedEnsemble_L2': {'use_orig_features': False,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'LightGBMXT_BAG_L2': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'LightGBM_BAG_L2': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'RandomForestMSE_BAG_L2': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True,
   'use_child_oof': True},
  'CatBoost_BAG_L2': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'ExtraTreesMSE_BAG_L2': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True,
   'use_child_oof': True},
  'WeightedEnsemble_L3': {'use_orig_features': False,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True}},
 'leaderboard':                      model   score_val  pred_time_val    fit_time  \
 0      WeightedEnsemble_L3  -52.865846      13.389465  521.269552   
 1   RandomForestMSE_BAG_L2  -53.451478      12.257624  414.610369   
 2     ExtraTreesMSE_BAG_L2  -53.859260      12.389017  387.974257   
 3          LightGBM_BAG_L2  -55.133954      11.748551  385.337010   
 4          CatBoost_BAG_L2  -55.468453      11.599283  450.806419   
 5        LightGBMXT_BAG_L2  -60.470103      15.426254  419.170218   
 6    KNeighborsDist_BAG_L1  -84.125061       0.104822    0.036808   
 7      WeightedEnsemble_L2  -84.125061       0.106286    0.763713   
 8    KNeighborsUnif_BAG_L1 -101.546199       0.056908    0.039038   
 9   RandomForestMSE_BAG_L1 -116.548359       0.817105   19.042239   
 10    ExtraTreesMSE_BAG_L1 -124.600676       0.513394    8.290484   
 11         CatBoost_BAG_L1 -130.571286       0.110086  240.143042   
 12         LightGBM_BAG_L1 -131.054162       1.394772   12.816886   
 13       LightGBMXT_BAG_L1 -131.460909       7.996730   51.984224   
 14  NeuralNetFastAI_BAG_L1 -140.319540       0.541459   40.229147   
 
     pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  \
 0                 0.000817           0.287105            3       True   
 1                 0.722348          42.028500            2       True   
 2                 0.853741          15.392388            2       True   
 3                 0.213275          12.755141            2       True   
 4                 0.064007          78.224550            2       True   
 5                 3.890978          46.588348            2       True   
 6                 0.104822           0.036808            1       True   
 7                 0.001464           0.726904            2       True   
 8                 0.056908           0.039038            1       True   
 9                 0.817105          19.042239            1       True   
 10                0.513394           8.290484            1       True   
 11                0.110086         240.143042            1       True   
 12                1.394772          12.816886            1       True   
 13                7.996730          51.984224            1       True   
 14                0.541459          40.229147            1       True   
 
     fit_order  
 0          15  
 1          12  
 2          14  
 3          11  
 4          13  
 5          10  
 6           2  
 7           9  
 8           1  
 9           5  
 10          7  
 11          6  
 12          4  
 13          3  
 14          8  }

Create predictions from test dataset¶

In [12]:
predictions = predictor.predict(test)
predictions = {'datetime': test['datetime'], 'Pred_count': predictions}
predictions = pd.DataFrame(data=predictions)
predictions.head()
/opt/conda/lib/python3.10/site-packages/autogluon/features/generators/fillna.py:58: FutureWarning: The 'downcast' keyword in fillna is deprecated and will be removed in a future version. Use res.infer_objects(copy=False) to infer non-object dtype, or pd.to_numeric with the 'downcast' keyword to downcast numeric results.
  X.fillna(self._fillna_feature_map, inplace=True, downcast=False)
Out[12]:
datetime Pred_count
0 2011-01-20 00:00:00 23.776628
1 2011-01-20 01:00:00 42.635658
2 2011-01-20 02:00:00 46.400299
3 2011-01-20 03:00:00 49.024002
4 2011-01-20 04:00:00 51.627224

NOTE: Kaggle will reject the submission if we don't set everything to be > 0.¶

In [13]:
# Describe the `predictions` series to see if there are any negative values
predictions.describe()
Out[13]:
Pred_count
count 6493.000000
mean 101.049049
std 90.335938
min 3.017506
25% 20.798779
50% 63.234932
75% 169.659332
max 362.337433
In [14]:
# How many negative values do we have?
neg = predictions.groupby(predictions['Pred_count'])

# lambda function
def minus(val):
   return val[val < 0].sum()

print(neg['Pred_count'].agg([('negcount', minus)]))
            negcount
Pred_count          
3.017506         0.0
3.048299         0.0
3.086751         0.0
3.100743         0.0
3.101935         0.0
...              ...
361.426758       0.0
361.816986       0.0
362.043579       0.0
362.062012       0.0
362.337433       0.0

[6263 rows x 1 columns]
In [15]:
# Set them to zero
predictions[predictions['Pred_count']<0] = 0
In [16]:
predictions.describe()
Out[16]:
Pred_count
count 6493.000000
mean 101.049049
std 90.335938
min 3.017506
25% 20.798779
50% 63.234932
75% 169.659332
max 362.337433
In [17]:
predictions.head()
Out[17]:
datetime Pred_count
0 2011-01-20 00:00:00 23.776628
1 2011-01-20 01:00:00 42.635658
2 2011-01-20 02:00:00 46.400299
3 2011-01-20 03:00:00 49.024002
4 2011-01-20 04:00:00 51.627224

Set predictions to submission dataframe, save, and submit¶

In [18]:
submission["count"] = predictions['Pred_count']
submission.to_csv("submission.csv", index=False)
In [21]:
!kaggle competitions submit -c bike-sharing-demand -f submission.csv -m "first raw submission"
100%|█████████████████████████████████████████| 188k/188k [00:00<00:00, 612kB/s]
Successfully submitted to Bike Sharing Demand

View submission via the command line or in the web browser under the competition's page - My Submissions¶

In [40]:
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName        date                 description           status    publicScore  privateScore  
--------------  -------------------  --------------------  --------  -----------  ------------  
submission.csv  2024-04-29 11:12:02  first raw submission  complete  1.7998       1.7998        

Initial score of 1.7998¶

Step 4: Exploratory Data Analysis and Creating an additional feature¶

  • Any additional feature will do, but a great suggestion would be to separate out the datetime into hour, day, or month parts.
In [23]:
# Create a histogram of all features to show the distribution of each one relative to the data. This is part of the exploritory data analysis
train.hist()
Out[23]:
array([[<Axes: title={'center': 'season'}>,
        <Axes: title={'center': 'holiday'}>,
        <Axes: title={'center': 'workingday'}>],
       [<Axes: title={'center': 'weather'}>,
        <Axes: title={'center': 'temp'}>,
        <Axes: title={'center': 'atemp'}>],
       [<Axes: title={'center': 'humidity'}>,
        <Axes: title={'center': 'windspeed'}>,
        <Axes: title={'center': 'casual'}>],
       [<Axes: title={'center': 'registered'}>,
        <Axes: title={'center': 'count'}>, <Axes: >]], dtype=object)
No description has been provided for this image
In [26]:
# create a new feature
# Assuming train and test are your DataFrames
train['datetime'] = pd.to_datetime(train['datetime'])
test['datetime'] = pd.to_datetime(test['datetime'])

# Now, you can access the components of the datetime column
train['year'] = train['datetime'].dt.year
train['month'] = train['datetime'].dt.month
train['day'] = train['datetime'].dt.day
train['hour'] = train['datetime'].dt.hour

test['year'] = test['datetime'].dt.year
test['month'] = test['datetime'].dt.month
test['day'] = test['datetime'].dt.day
test['hour'] = test['datetime'].dt.hour

Make category types for these so models know they are not just numbers¶

  • AutoGluon originally sees these as ints, but in reality they are int representations of a category.
  • Setting the dtype to category will classify these as categories in AutoGluon.
In [27]:
train["season"] = train["season"].astype("category")
train["weather"] = train["weather"].astype("category")
test["season"] = test["season"].astype("category")
test["weather"] = test["weather"].astype("category")
In [28]:
# View are new feature
train.head()
Out[28]:
datetime season holiday workingday weather temp atemp humidity windspeed casual registered count year month day hour
0 2011-01-01 00:00:00 1 0 0 1 9.84 14.395 81 0.0 3 13 16 2011 1 1 0
1 2011-01-01 01:00:00 1 0 0 1 9.02 13.635 80 0.0 8 32 40 2011 1 1 1
2 2011-01-01 02:00:00 1 0 0 1 9.02 13.635 80 0.0 5 27 32 2011 1 1 2
3 2011-01-01 03:00:00 1 0 0 1 9.84 14.395 75 0.0 3 10 13 2011 1 1 3
4 2011-01-01 04:00:00 1 0 0 1 9.84 14.395 75 0.0 0 1 1 2011 1 1 4
In [29]:
# View histogram of all features again now with the hour feature
train.hist()
Out[29]:
array([[<Axes: title={'center': 'datetime'}>,
        <Axes: title={'center': 'holiday'}>,
        <Axes: title={'center': 'workingday'}>,
        <Axes: title={'center': 'temp'}>],
       [<Axes: title={'center': 'atemp'}>,
        <Axes: title={'center': 'humidity'}>,
        <Axes: title={'center': 'windspeed'}>,
        <Axes: title={'center': 'casual'}>],
       [<Axes: title={'center': 'registered'}>,
        <Axes: title={'center': 'count'}>,
        <Axes: title={'center': 'year'}>,
        <Axes: title={'center': 'month'}>],
       [<Axes: title={'center': 'day'}>,
        <Axes: title={'center': 'hour'}>, <Axes: >, <Axes: >]],
      dtype=object)
No description has been provided for this image

Step 5: Rerun the model with the same settings as before, just with more features¶

In [30]:
predictor_new_features = TabularPredictor(
    label="count", problem_type="regression", eval_metric="rmse"
    ).fit(
    train_data=train.drop(['casual', 'registered'], axis=1),
    time_limit=600,
    presets='best_quality')
No path specified. Models will be saved in: "AutogluonModels/ag-20240429_111603"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20240429_111603"
AutoGluon Version:  0.8.2
Python Version:     3.10.14
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Sat Mar 23 09:49:55 UTC 2024
Disk Space Avail:   3.05 GB / 5.36 GB (57.0%)
	WARNING: Available disk space is low and there is a risk that AutoGluon will run out of disk during fit, causing an exception. 
	We recommend a minimum available disk space of 10 GB, and large datasets may require more.
Train Data Rows:    10886
Train Data Columns: 13
Label Column: count
Preprocessing data ...
/opt/conda/lib/python3.10/site-packages/autogluon/tabular/learner/default_learner.py:215: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
  with pd.option_context("mode.use_inf_as_na", True):  # treat None, NaN, INF, NINF as NA
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    1952.55 MB
	Train Data (Original)  Memory Usage: 0.81 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
	Stage 1 Generators:
		Fitting AsTypeFeatureGenerator...
			Note: Converting 3 features to boolean dtype as they only contain 2 unique values.
	Stage 2 Generators:
		Fitting FillNaFeatureGenerator...
	Stage 3 Generators:
		Fitting IdentityFeatureGenerator...
		Fitting CategoryFeatureGenerator...
			Fitting CategoryMemoryMinimizeFeatureGenerator...
		Fitting DatetimeFeatureGenerator...
	Stage 4 Generators:
		Fitting DropUniqueFeatureGenerator...
	Stage 5 Generators:
		Fitting DropDuplicatesFeatureGenerator...
	Types of features in original data (raw dtype, special dtypes):
		('category', []) : 2 | ['season', 'weather']
		('datetime', []) : 1 | ['datetime']
		('float', [])    : 3 | ['temp', 'atemp', 'windspeed']
		('int', [])      : 7 | ['holiday', 'workingday', 'humidity', 'year', 'month', ...]
	Types of features in processed data (raw dtype, special dtypes):
		('category', [])             : 2 | ['season', 'weather']
		('float', [])                : 3 | ['temp', 'atemp', 'windspeed']
		('int', [])                  : 4 | ['humidity', 'month', 'day', 'hour']
		('int', ['bool'])            : 3 | ['holiday', 'workingday', 'year']
		('int', ['datetime_as_int']) : 3 | ['datetime', 'datetime.year', 'datetime.dayofweek']
	1.4s = Fit runtime
	13 features in original data used to generate 15 features in processed data.
	Train Data (Processed) Memory Usage: 0.8 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 1.47s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
	This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
	To change this, specify the eval_metric parameter of Predictor()
User-specified model hyperparameters to be fit:
{
	'NN_TORCH': {},
	'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
	'CAT': {},
	'XGB': {},
	'FASTAI': {},
	'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
	'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 398.92s of the 598.52s of remaining time.
	-101.5462	 = Validation score   (-root_mean_squared_error)
	0.04s	 = Training   runtime
	0.04s	 = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 398.79s of the 598.4s of remaining time.
	-84.1251	 = Validation score   (-root_mean_squared_error)
	0.04s	 = Training   runtime
	0.05s	 = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 398.66s of the 598.27s of remaining time.
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
[1000]	valid_set's rmse: 35.722
[2000]	valid_set's rmse: 34.0646
[3000]	valid_set's rmse: 33.7501
[4000]	valid_set's rmse: 33.5663
[5000]	valid_set's rmse: 33.5927
[1000]	valid_set's rmse: 36.6943
[2000]	valid_set's rmse: 34.7009
[3000]	valid_set's rmse: 34.2654
[4000]	valid_set's rmse: 34.0805
[5000]	valid_set's rmse: 34.0068
[6000]	valid_set's rmse: 33.9926
[7000]	valid_set's rmse: 34.0148
[8000]	valid_set's rmse: 34.0505
[1000]	valid_set's rmse: 37.0225
[2000]	valid_set's rmse: 34.5264
[3000]	valid_set's rmse: 33.9428
[4000]	valid_set's rmse: 33.6752
[5000]	valid_set's rmse: 33.5411
[6000]	valid_set's rmse: 33.4628
[7000]	valid_set's rmse: 33.3908
[8000]	valid_set's rmse: 33.3862
[9000]	valid_set's rmse: 33.3645
[10000]	valid_set's rmse: 33.3686
[1000]	valid_set's rmse: 38.1752
[2000]	valid_set's rmse: 36.5188
[3000]	valid_set's rmse: 36.1264
[4000]	valid_set's rmse: 35.9954
[5000]	valid_set's rmse: 35.9337
[6000]	valid_set's rmse: 35.9463
[1000]	valid_set's rmse: 38.9031
[2000]	valid_set's rmse: 36.7896
[3000]	valid_set's rmse: 36.3287
[4000]	valid_set's rmse: 36.2175
[5000]	valid_set's rmse: 36.1359
[6000]	valid_set's rmse: 36.0948
[7000]	valid_set's rmse: 36.174
[1000]	valid_set's rmse: 35.8977
[2000]	valid_set's rmse: 33.4992
[3000]	valid_set's rmse: 32.7907
[4000]	valid_set's rmse: 32.4471
[5000]	valid_set's rmse: 32.2892
[6000]	valid_set's rmse: 32.2846
[7000]	valid_set's rmse: 32.2649
[8000]	valid_set's rmse: 32.3084
[1000]	valid_set's rmse: 38.3394
[2000]	valid_set's rmse: 37.1199
[3000]	valid_set's rmse: 36.8417
[4000]	valid_set's rmse: 36.6798
[5000]	valid_set's rmse: 36.6466
[6000]	valid_set's rmse: 36.6288
[7000]	valid_set's rmse: 36.6832
[1000]	valid_set's rmse: 35.8969
[2000]	valid_set's rmse: 34.1606
[3000]	valid_set's rmse: 33.8527
[4000]	valid_set's rmse: 33.714
[5000]	valid_set's rmse: 33.6917
	-34.4539	 = Validation score   (-root_mean_squared_error)
	82.09s	 = Training   runtime
	14.9s	 = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 290.28s of the 489.88s of remaining time.
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
[1000]	valid_set's rmse: 33.1713
[2000]	valid_set's rmse: 33.0077
[1000]	valid_set's rmse: 32.8635
[2000]	valid_set's rmse: 32.6404
[1000]	valid_set's rmse: 31.9543
[2000]	valid_set's rmse: 31.343
[3000]	valid_set's rmse: 30.9039
[4000]	valid_set's rmse: 30.8612
[1000]	valid_set's rmse: 35.8483
[2000]	valid_set's rmse: 35.4773
[3000]	valid_set's rmse: 35.3993
[1000]	valid_set's rmse: 35.5388
[1000]	valid_set's rmse: 31.6283
[1000]	valid_set's rmse: 37.9327
[2000]	valid_set's rmse: 37.4577
[1000]	valid_set's rmse: 34.9434
[2000]	valid_set's rmse: 34.6719
	-33.9173	 = Validation score   (-root_mean_squared_error)
	30.95s	 = Training   runtime
	3.36s	 = Validation runtime
Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 252.17s of the 451.77s of remaining time.
	-38.425	 = Validation score   (-root_mean_squared_error)
	20.96s	 = Training   runtime
	0.61s	 = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 229.93s of the 429.54s of remaining time.
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
	Ran out of time, early stopping on iteration 2051.
	Ran out of time, early stopping on iteration 2220.
	Ran out of time, early stopping on iteration 2283.
	Ran out of time, early stopping on iteration 2382.
	Ran out of time, early stopping on iteration 2447.
	Ran out of time, early stopping on iteration 2516.
	Ran out of time, early stopping on iteration 2871.
	Ran out of time, early stopping on iteration 3160.
	-34.342	 = Validation score   (-root_mean_squared_error)
	220.46s	 = Training   runtime
	0.12s	 = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L1 ... Training model for up to 9.24s of the 208.84s of remaining time.
	-38.1073	 = Validation score   (-root_mean_squared_error)
	9.76s	 = Training   runtime
	0.8s	 = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 197.54s of remaining time.
	-32.2137	 = Validation score   (-root_mean_squared_error)
	0.51s	 = Training   runtime
	0.0s	 = Validation runtime
Fitting 9 L2 models ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 197.0s of the 196.99s of remaining time.
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
[1000]	valid_set's rmse: 30.5043
[1000]	valid_set's rmse: 31.5792
	-31.1511	 = Validation score   (-root_mean_squared_error)
	19.62s	 = Training   runtime
	1.04s	 = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 175.13s of the 175.11s of remaining time.
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
	-30.5897	 = Validation score   (-root_mean_squared_error)
	14.36s	 = Training   runtime
	0.26s	 = Validation runtime
Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 160.18s of the 160.17s of remaining time.
	-31.6744	 = Validation score   (-root_mean_squared_error)
	51.4s	 = Training   runtime
	0.82s	 = Validation runtime
Fitting model: CatBoost_BAG_L2 ... Training model for up to 107.29s of the 107.28s of remaining time.
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
	Ran out of time, early stopping on iteration 866.
	Ran out of time, early stopping on iteration 946.
	Ran out of time, early stopping on iteration 931.
	Ran out of time, early stopping on iteration 901.
	Ran out of time, early stopping on iteration 1000.
	Ran out of time, early stopping on iteration 1066.
	Ran out of time, early stopping on iteration 1134.
	Ran out of time, early stopping on iteration 1403.
	-30.5154	 = Validation score   (-root_mean_squared_error)
	102.8s	 = Training   runtime
	0.1s	 = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L2 ... Training model for up to 4.31s of the 4.3s of remaining time.
	-31.4851	 = Validation score   (-root_mean_squared_error)
	15.74s	 = Training   runtime
	0.74s	 = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the -12.76s of remaining time.
	-30.2404	 = Validation score   (-root_mean_squared_error)
	0.36s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 613.17s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240429_111603")
In [31]:
predictor_new_features.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
                     model   score_val  pred_time_val    fit_time  pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  fit_order
0      WeightedEnsemble_L3  -30.240410      22.095974  552.836447                0.001155           0.359319            3       True         14
1          CatBoost_BAG_L2  -30.515401      19.975629  467.097923                0.096180         102.799073            2       True         12
2          LightGBM_BAG_L2  -30.589689      20.138117  378.655426                0.258667          14.356575            2       True         10
3        LightGBMXT_BAG_L2  -31.151067      20.918398  383.922330                1.038949          19.623480            2       True          9
4     ExtraTreesMSE_BAG_L2  -31.485109      20.621979  380.036663                0.742529          15.737813            2       True         13
5   RandomForestMSE_BAG_L2  -31.674417      20.701023  415.698001                0.821574          51.399150            2       True         11
6      WeightedEnsemble_L2  -32.213694      19.040947  355.014108                0.000823           0.511402            2       True          8
7          LightGBM_BAG_L1  -33.917339       3.360040   30.952478                3.360040          30.952478            1       True          4
8          CatBoost_BAG_L1  -34.341995       0.121933  220.460180                0.121933         220.460180            1       True          6
9        LightGBMXT_BAG_L1  -34.453884      14.897718   82.090137               14.897718          82.090137            1       True          3
10    ExtraTreesMSE_BAG_L1  -38.107278       0.795041    9.756352                0.795041           9.756352            1       True          7
11  RandomForestMSE_BAG_L1  -38.424984       0.612650   20.955973                0.612650          20.955973            1       True          5
12   KNeighborsDist_BAG_L1  -84.125061       0.047784    0.043936                0.047784           0.043936            1       True          2
13   KNeighborsUnif_BAG_L1 -101.546199       0.044284    0.039792                0.044284           0.039792            1       True          1
Number of models trained: 14
Types of models trained:
{'WeightedEnsembleModel', 'StackerEnsembleModel_XT', 'StackerEnsembleModel_CatBoost', 'StackerEnsembleModel_LGB', 'StackerEnsembleModel_RF', 'StackerEnsembleModel_KNN'}
Bagging used: True  (with 8 folds)
Multi-layer stack-ensembling used: True  (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('category', [])             : 2 | ['season', 'weather']
('float', [])                : 3 | ['temp', 'atemp', 'windspeed']
('int', [])                  : 4 | ['humidity', 'month', 'day', 'hour']
('int', ['bool'])            : 3 | ['holiday', 'workingday', 'year']
('int', ['datetime_as_int']) : 3 | ['datetime', 'datetime.year', 'datetime.dayofweek']
*** End of fit() summary ***
/opt/conda/lib/python3.10/site-packages/autogluon/core/utils/plots.py:169: UserWarning: AutoGluon summary plots cannot be created because bokeh is not installed. To see plots, please do: "pip install bokeh==2.0.1"
  warnings.warn('AutoGluon summary plots cannot be created because bokeh is not installed. To see plots, please do: "pip install bokeh==2.0.1"')
Out[31]:
{'model_types': {'KNeighborsUnif_BAG_L1': 'StackerEnsembleModel_KNN',
  'KNeighborsDist_BAG_L1': 'StackerEnsembleModel_KNN',
  'LightGBMXT_BAG_L1': 'StackerEnsembleModel_LGB',
  'LightGBM_BAG_L1': 'StackerEnsembleModel_LGB',
  'RandomForestMSE_BAG_L1': 'StackerEnsembleModel_RF',
  'CatBoost_BAG_L1': 'StackerEnsembleModel_CatBoost',
  'ExtraTreesMSE_BAG_L1': 'StackerEnsembleModel_XT',
  'WeightedEnsemble_L2': 'WeightedEnsembleModel',
  'LightGBMXT_BAG_L2': 'StackerEnsembleModel_LGB',
  'LightGBM_BAG_L2': 'StackerEnsembleModel_LGB',
  'RandomForestMSE_BAG_L2': 'StackerEnsembleModel_RF',
  'CatBoost_BAG_L2': 'StackerEnsembleModel_CatBoost',
  'ExtraTreesMSE_BAG_L2': 'StackerEnsembleModel_XT',
  'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
 'model_performance': {'KNeighborsUnif_BAG_L1': -101.54619908446061,
  'KNeighborsDist_BAG_L1': -84.12506123181602,
  'LightGBMXT_BAG_L1': -34.453884062670745,
  'LightGBM_BAG_L1': -33.91733862651761,
  'RandomForestMSE_BAG_L1': -38.424983594881716,
  'CatBoost_BAG_L1': -34.34199492944324,
  'ExtraTreesMSE_BAG_L1': -38.10727767243523,
  'WeightedEnsemble_L2': -32.2136936832968,
  'LightGBMXT_BAG_L2': -31.151066802192368,
  'LightGBM_BAG_L2': -30.589688521755814,
  'RandomForestMSE_BAG_L2': -31.674416659292678,
  'CatBoost_BAG_L2': -30.5154005834992,
  'ExtraTreesMSE_BAG_L2': -31.485108815338208,
  'WeightedEnsemble_L3': -30.24040961851195},
 'model_best': 'WeightedEnsemble_L3',
 'model_paths': {'KNeighborsUnif_BAG_L1': ['KNeighborsUnif_BAG_L1'],
  'KNeighborsDist_BAG_L1': ['KNeighborsDist_BAG_L1'],
  'LightGBMXT_BAG_L1': ['LightGBMXT_BAG_L1'],
  'LightGBM_BAG_L1': ['LightGBM_BAG_L1'],
  'RandomForestMSE_BAG_L1': ['RandomForestMSE_BAG_L1'],
  'CatBoost_BAG_L1': ['CatBoost_BAG_L1'],
  'ExtraTreesMSE_BAG_L1': ['ExtraTreesMSE_BAG_L1'],
  'WeightedEnsemble_L2': ['WeightedEnsemble_L2'],
  'LightGBMXT_BAG_L2': ['LightGBMXT_BAG_L2'],
  'LightGBM_BAG_L2': ['LightGBM_BAG_L2'],
  'RandomForestMSE_BAG_L2': ['RandomForestMSE_BAG_L2'],
  'CatBoost_BAG_L2': ['CatBoost_BAG_L2'],
  'ExtraTreesMSE_BAG_L2': ['ExtraTreesMSE_BAG_L2'],
  'WeightedEnsemble_L3': ['WeightedEnsemble_L3']},
 'model_fit_times': {'KNeighborsUnif_BAG_L1': 0.03979229927062988,
  'KNeighborsDist_BAG_L1': 0.04393649101257324,
  'LightGBMXT_BAG_L1': 82.09013748168945,
  'LightGBM_BAG_L1': 30.952478408813477,
  'RandomForestMSE_BAG_L1': 20.955973148345947,
  'CatBoost_BAG_L1': 220.46018028259277,
  'ExtraTreesMSE_BAG_L1': 9.756352186203003,
  'WeightedEnsemble_L2': 0.511401891708374,
  'LightGBMXT_BAG_L2': 19.623480081558228,
  'LightGBM_BAG_L2': 14.35657525062561,
  'RandomForestMSE_BAG_L2': 51.399150371551514,
  'CatBoost_BAG_L2': 102.79907274246216,
  'ExtraTreesMSE_BAG_L2': 15.737812995910645,
  'WeightedEnsemble_L3': 0.35931873321533203},
 'model_pred_times': {'KNeighborsUnif_BAG_L1': 0.04428362846374512,
  'KNeighborsDist_BAG_L1': 0.04778409004211426,
  'LightGBMXT_BAG_L1': 14.897717952728271,
  'LightGBM_BAG_L1': 3.360039710998535,
  'RandomForestMSE_BAG_L1': 0.6126501560211182,
  'CatBoost_BAG_L1': 0.1219325065612793,
  'ExtraTreesMSE_BAG_L1': 0.7950413227081299,
  'WeightedEnsemble_L2': 0.0008225440979003906,
  'LightGBMXT_BAG_L2': 1.0389490127563477,
  'LightGBM_BAG_L2': 0.2586674690246582,
  'RandomForestMSE_BAG_L2': 0.8215737342834473,
  'CatBoost_BAG_L2': 0.09617972373962402,
  'ExtraTreesMSE_BAG_L2': 0.7425293922424316,
  'WeightedEnsemble_L3': 0.0011551380157470703},
 'num_bag_folds': 8,
 'max_stack_level': 3,
 'model_hyperparams': {'KNeighborsUnif_BAG_L1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True,
   'use_child_oof': True},
  'KNeighborsDist_BAG_L1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True,
   'use_child_oof': True},
  'LightGBMXT_BAG_L1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'LightGBM_BAG_L1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'RandomForestMSE_BAG_L1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True,
   'use_child_oof': True},
  'CatBoost_BAG_L1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'ExtraTreesMSE_BAG_L1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True,
   'use_child_oof': True},
  'WeightedEnsemble_L2': {'use_orig_features': False,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'LightGBMXT_BAG_L2': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'LightGBM_BAG_L2': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'RandomForestMSE_BAG_L2': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True,
   'use_child_oof': True},
  'CatBoost_BAG_L2': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'ExtraTreesMSE_BAG_L2': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True,
   'use_child_oof': True},
  'WeightedEnsemble_L3': {'use_orig_features': False,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True}},
 'leaderboard':                      model   score_val  pred_time_val    fit_time  \
 0      WeightedEnsemble_L3  -30.240410      22.095974  552.836447   
 1          CatBoost_BAG_L2  -30.515401      19.975629  467.097923   
 2          LightGBM_BAG_L2  -30.589689      20.138117  378.655426   
 3        LightGBMXT_BAG_L2  -31.151067      20.918398  383.922330   
 4     ExtraTreesMSE_BAG_L2  -31.485109      20.621979  380.036663   
 5   RandomForestMSE_BAG_L2  -31.674417      20.701023  415.698001   
 6      WeightedEnsemble_L2  -32.213694      19.040947  355.014108   
 7          LightGBM_BAG_L1  -33.917339       3.360040   30.952478   
 8          CatBoost_BAG_L1  -34.341995       0.121933  220.460180   
 9        LightGBMXT_BAG_L1  -34.453884      14.897718   82.090137   
 10    ExtraTreesMSE_BAG_L1  -38.107278       0.795041    9.756352   
 11  RandomForestMSE_BAG_L1  -38.424984       0.612650   20.955973   
 12   KNeighborsDist_BAG_L1  -84.125061       0.047784    0.043936   
 13   KNeighborsUnif_BAG_L1 -101.546199       0.044284    0.039792   
 
     pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  \
 0                 0.001155           0.359319            3       True   
 1                 0.096180         102.799073            2       True   
 2                 0.258667          14.356575            2       True   
 3                 1.038949          19.623480            2       True   
 4                 0.742529          15.737813            2       True   
 5                 0.821574          51.399150            2       True   
 6                 0.000823           0.511402            2       True   
 7                 3.360040          30.952478            1       True   
 8                 0.121933         220.460180            1       True   
 9                14.897718          82.090137            1       True   
 10                0.795041           9.756352            1       True   
 11                0.612650          20.955973            1       True   
 12                0.047784           0.043936            1       True   
 13                0.044284           0.039792            1       True   
 
     fit_order  
 0          14  
 1          12  
 2          10  
 3           9  
 4          13  
 5          11  
 6           8  
 7           4  
 8           6  
 9           3  
 10          7  
 11          5  
 12          2  
 13          1  }
In [33]:
predictions_new_features = predictor_new_features.predict(test)
predictions_new_features = {'datetime': test['datetime'], 'Pred_count': predictions_new_features}
predictions_new_features = pd.DataFrame(data=predictions_new_features)
predictions_new_features.head()
Out[33]:
datetime Pred_count
0 2011-01-20 00:00:00 15.958500
1 2011-01-20 01:00:00 11.255019
2 2011-01-20 02:00:00 10.674420
3 2011-01-20 03:00:00 9.340588
4 2011-01-20 04:00:00 7.769901
In [34]:
# Remember to set all negative values to zero
predictions_new_features[predictions_new_features['Pred_count']<0] = 0
In [35]:
predictions_new_features.describe()
Out[35]:
datetime Pred_count
count 6493 6493.000000
mean 2012-01-13 09:27:47.765285632 154.844589
min 2011-01-20 00:00:00 1.780506
25% 2011-07-22 15:00:00 53.331760
50% 2012-01-20 23:00:00 119.993599
75% 2012-07-20 17:00:00 220.551758
max 2012-12-31 23:00:00 816.064575
std NaN 133.741531
In [43]:
# Same submitting predictions
submission_new_features = pd.read_csv('submission.csv')
submission_new_features["count"] = predictions_new_features['Pred_count']
submission_new_features.to_csv("submission_new_features.csv", index=False)
In [44]:
!kaggle competitions submit -c bike-sharing-demand -f submission_new_features.csv -m "new features"
100%|█████████████████████████████████████████| 188k/188k [00:00<00:00, 660kB/s]
Successfully submitted to Bike Sharing Demand
In [45]:
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName                     date                 description           status    publicScore  privateScore  
---------------------------  -------------------  --------------------  --------  -----------  ------------  
submission_new_features.csv  2024-04-29 11:42:11  new features          complete  0.68259      0.68259       
submission.csv               2024-04-29 11:12:02  first raw submission  complete  1.7998       1.7998        

New Score of 0.68259?¶

Step 6: Hyper parameter optimization¶

  • There are many options for hyper parameter optimization.
  • Options are to change the AutoGluon higher level parameters or the individual model hyperparameters.
  • The hyperparameters of the models themselves that are in AutoGluon. Those need the hyperparameter and hyperparameter_tune_kwargs arguments.
In [7]:
import autogluon.core as ag
from autogluon.common import space
from autogluon.tabular import TabularPredictor

nn_options = {  
    'dropout_prob': space.Real(0.0, 0.5, default=0.1),  # dropout probability 
}

gbm_options = {  
    'num_boost_round': 100,  # number of boosting rounds 
    'num_leaves': space.Int(lower=26, upper=66, default=36),  # number of leaves in trees
}

hyperparameters = {  # hyperparameters of each model type
                   'GBM': gbm_options,
                   'NN_TORCH': nn_options, 
                  }  

num_trials = 3  # try at most 3 different hyperparameter configurations for each type of model
search_strategy = 'auto'  # tune hyperparameters using Bayesian optimization routine with a local scheduler

hyperparameter_tune_kwargs = { 
    'num_trials': num_trials,
    'scheduler' : 'local',
    'searcher': search_strategy,
}

predictor_new_hpo = TabularPredictor(
    label="count", problem_type="regression", eval_metric="rmse"
    ).fit(
    train_data=train.drop(['casual', 'registered'], axis=1),
    time_limit=600,
    presets='best_quality', hyperparameters=hyperparameters, hyperparameter_tune_kwargs=hyperparameter_tune_kwargs)
No path specified. Models will be saved in: "AutogluonModels/ag-20240429_124008"
Presets specified: ['best_quality']
Warning: hyperparameter tuning is currently experimental and may cause the process to hang.
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
/opt/conda/lib/python3.10/site-packages/pkg_resources/__init__.py:2832: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
  declare_namespace(pkg)
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20240429_124008"
AutoGluon Version:  0.8.2
Python Version:     3.10.14
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Sat Mar 23 09:49:55 UTC 2024
Disk Space Avail:   1.63 GB / 5.36 GB (30.4%)
	WARNING: Available disk space is low and there is a risk that AutoGluon will run out of disk during fit, causing an exception. 
	We recommend a minimum available disk space of 10 GB, and large datasets may require more.
Train Data Rows:    10886
Train Data Columns: 9
Label Column: count
Preprocessing data ...
/opt/conda/lib/python3.10/site-packages/autogluon/tabular/learner/default_learner.py:215: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
  with pd.option_context("mode.use_inf_as_na", True):  # treat None, NaN, INF, NINF as NA
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    2228.55 MB
	Train Data (Original)  Memory Usage: 1.52 MB (0.1% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
	Stage 1 Generators:
		Fitting AsTypeFeatureGenerator...
			Note: Converting 2 features to boolean dtype as they only contain 2 unique values.
	Stage 2 Generators:
		Fitting FillNaFeatureGenerator...
/opt/conda/lib/python3.10/site-packages/autogluon/features/generators/fillna.py:58: FutureWarning: The 'downcast' keyword in fillna is deprecated and will be removed in a future version. Use res.infer_objects(copy=False) to infer non-object dtype, or pd.to_numeric with the 'downcast' keyword to downcast numeric results.
  X.fillna(self._fillna_feature_map, inplace=True, downcast=False)
/opt/conda/lib/python3.10/site-packages/autogluon/features/generators/fillna.py:58: FutureWarning: The 'downcast' keyword in fillna is deprecated and will be removed in a future version. Use res.infer_objects(copy=False) to infer non-object dtype, or pd.to_numeric with the 'downcast' keyword to downcast numeric results.
  X.fillna(self._fillna_feature_map, inplace=True, downcast=False)
	Stage 3 Generators:
		Fitting IdentityFeatureGenerator...
		Fitting DatetimeFeatureGenerator...
	Stage 4 Generators:
		Fitting DropUniqueFeatureGenerator...
	Stage 5 Generators:
		Fitting DropDuplicatesFeatureGenerator...
	Types of features in original data (raw dtype, special dtypes):
		('float', [])                      : 3 | ['temp', 'atemp', 'windspeed']
		('int', [])                        : 5 | ['season', 'holiday', 'workingday', 'weather', 'humidity']
		('object', ['datetime_as_object']) : 1 | ['datetime']
	Types of features in processed data (raw dtype, special dtypes):
		('float', [])                : 3 | ['temp', 'atemp', 'windspeed']
		('int', [])                  : 3 | ['season', 'weather', 'humidity']
		('int', ['bool'])            : 2 | ['holiday', 'workingday']
		('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
	0.7s = Fit runtime
	9 features in original data used to generate 13 features in processed data.
	Train Data (Processed) Memory Usage: 0.98 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.8s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
	This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
	To change this, specify the eval_metric parameter of Predictor()
User-specified model hyperparameters to be fit:
{
	'GBM': {'num_boost_round': 100, 'num_leaves': Int: lower=26, upper=66},
	'NN_TORCH': {'dropout_prob': Real: lower=0.0, upper=0.5},
}
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 2 L1 models ...
Hyperparameter tuning model: LightGBM_BAG_L1 ... Tuning model for up to 179.71s of the 599.15s of remaining time.
  0%|          | 0/3 [00:00<?, ?it/s]
Will use sequential fold fitting strategy because import of ray failed. Reason: ray is required to train folds in parallel for TabularPredictor or HPO for MultiModalPredictor. A quick tip is to install via `pip install ray==2.6.3`
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
/opt/conda/lib/python3.10/site-packages/dask/dataframe/_pyarrow_compat.py:17: FutureWarning: Minimal version of pyarrow will soon be increased to 14.0.1. You are using 12.0.1. Please consider upgrading.
  warnings.warn(
/opt/conda/lib/python3.10/site-packages/dask/dataframe/__init__.py:31: FutureWarning: 
Dask dataframe query planning is disabled because dask-expr is not installed.

You can install it with `pip install dask[dataframe]` or `conda install dask`.
This will raise in a future version.

  warnings.warn(msg, FutureWarning)
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Fitted model: LightGBM_BAG_L1/T1 ...
	-135.4732	 = Validation score   (-root_mean_squared_error)
	6.0s	 = Training   runtime
	0.0s	 = Validation runtime
Fitted model: LightGBM_BAG_L1/T2 ...
	-135.0295	 = Validation score   (-root_mean_squared_error)
	5.3s	 = Training   runtime
	0.0s	 = Validation runtime
Fitted model: LightGBM_BAG_L1/T3 ...
	-134.1941	 = Validation score   (-root_mean_squared_error)
	4.53s	 = Training   runtime
	0.0s	 = Validation runtime
Hyperparameter tuning model: NeuralNetTorch_BAG_L1 ... Tuning model for up to 179.71s of the 575.81s of remaining time.
Will use custom hpo logic because ray import failed. Reason: ray is required to train folds in parallel for TabularPredictor or HPO for MultiModalPredictor. A quick tip is to install via `pip install ray==2.6.3`
  0%|          | 0/3 [00:00<?, ?it/s]
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
	Ran out of time, stopping training early. (Stopping on epoch 35)
	Ran out of time, stopping training early. (Stopping on epoch 40)
	Ran out of time, stopping training early. (Stopping on epoch 47)
	Ran out of time, stopping training early. (Stopping on epoch 51)
	Ran out of time, stopping training early. (Stopping on epoch 56)
	Ran out of time, stopping training early. (Stopping on epoch 69)
	Stopping HPO to satisfy time limit...
Fitted model: NeuralNetTorch_BAG_L1/T1 ...
	-142.041	 = Validation score   (-root_mean_squared_error)
	161.57s	 = Training   runtime
	0.0s	 = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 414.12s of remaining time.
	-134.003	 = Validation score   (-root_mean_squared_error)
	1.05s	 = Training   runtime
	0.0s	 = Validation runtime
Fitting 2 L2 models ...
Hyperparameter tuning model: LightGBM_BAG_L2 ... Tuning model for up to 185.84s of the 412.93s of remaining time.
  0%|          | 0/3 [00:00<?, ?it/s]
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Fitted model: LightGBM_BAG_L2/T1 ...
	-134.4063	 = Validation score   (-root_mean_squared_error)
	5.23s	 = Training   runtime
	0.0s	 = Validation runtime
Fitted model: LightGBM_BAG_L2/T2 ...
	-134.1484	 = Validation score   (-root_mean_squared_error)
	3.48s	 = Training   runtime
	0.0s	 = Validation runtime
Fitted model: LightGBM_BAG_L2/T3 ...
	-134.5661	 = Validation score   (-root_mean_squared_error)
	5.08s	 = Training   runtime
	0.0s	 = Validation runtime
Hyperparameter tuning model: NeuralNetTorch_BAG_L2 ... Tuning model for up to 185.84s of the 398.52s of remaining time.
  0%|          | 0/3 [00:00<?, ?it/s]
	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
	Stopping HPO to satisfy time limit...
Fitted model: NeuralNetTorch_BAG_L2/T1 ...
	-137.7589	 = Validation score   (-root_mean_squared_error)
	109.39s	 = Training   runtime
	0.0s	 = Validation runtime
Repeating k-fold bagging: 2/20
Fitting model: LightGBM_BAG_L2/T1 ... Training model for up to 289.07s of the 289.02s of remaining time.
	Fitting 8 child models (S2F1 - S2F8) | Fitting with SequentialLocalFoldFittingStrategy
	-134.001	 = Validation score   (-root_mean_squared_error)
	9.29s	 = Training   runtime
	0.07s	 = Validation runtime
Fitting model: LightGBM_BAG_L2/T2 ... Training model for up to 284.82s of the 284.77s of remaining time.
	Fitting 8 child models (S2F1 - S2F8) | Fitting with SequentialLocalFoldFittingStrategy
	-133.7499	 = Validation score   (-root_mean_squared_error)
	6.75s	 = Training   runtime
	0.09s	 = Validation runtime
Fitting model: LightGBM_BAG_L2/T3 ... Training model for up to 281.31s of the 281.25s of remaining time.
	Fitting 8 child models (S2F1 - S2F8) | Fitting with SequentialLocalFoldFittingStrategy
	-134.2248	 = Validation score   (-root_mean_squared_error)
	8.88s	 = Training   runtime
	0.08s	 = Validation runtime
Fitting model: NeuralNetTorch_BAG_L2/T1 ... Training model for up to 277.27s of the 277.23s of remaining time.
	Fitting 8 child models (S2F1 - S2F8) | Fitting with SequentialLocalFoldFittingStrategy
	-137.5566	 = Validation score   (-root_mean_squared_error)
	225.87s	 = Training   runtime
	0.21s	 = Validation runtime
Completed 2/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the 160.47s of remaining time.
	-133.5982	 = Validation score   (-root_mean_squared_error)
	0.68s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 440.32s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240429_124008")
In [8]:
predictor_new_hpo.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
                      model   score_val  pred_time_val    fit_time  pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  fit_order
0       WeightedEnsemble_L3 -133.598151       0.378901  419.581011                0.001148           0.681137            3       True         10
1        LightGBM_BAG_L2/T2 -133.749929       0.087237  184.153805                0.086385           6.749786            2       True          7
2        LightGBM_BAG_L2/T1 -134.001034       0.070539  186.694348                0.069688           9.290329            2       True          6
3       WeightedEnsemble_L2 -134.002986       0.003735  172.454185                0.003287           1.053466            2       True          5
4        LightGBM_BAG_L1/T3 -134.194130       0.000139    4.532218                0.000139           4.532218            1       True          3
5        LightGBM_BAG_L2/T3 -134.224763       0.077487  186.282038                0.076636           8.878019            2       True          8
6        LightGBM_BAG_L1/T2 -135.029528       0.000109    5.302512                0.000109           5.302512            1       True          2
7        LightGBM_BAG_L1/T1 -135.473207       0.000403    6.003300                0.000403           6.003300            1       True          1
8  NeuralNetTorch_BAG_L2/T1 -137.556585       0.214733  403.272069                0.213881         225.868050            2       True          9
9  NeuralNetTorch_BAG_L1/T1 -142.041032       0.000201  161.565988                0.000201         161.565988            1       True          4
Number of models trained: 10
Types of models trained:
{'WeightedEnsembleModel', 'StackerEnsembleModel_TabularNeuralNetTorch', 'StackerEnsembleModel_LGB'}
Bagging used: True  (with 8 folds)
Multi-layer stack-ensembling used: True  (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('float', [])                : 3 | ['temp', 'atemp', 'windspeed']
('int', [])                  : 3 | ['season', 'weather', 'humidity']
('int', ['bool'])            : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
*** End of fit() summary ***
/opt/conda/lib/python3.10/site-packages/autogluon/core/utils/plots.py:169: UserWarning: AutoGluon summary plots cannot be created because bokeh is not installed. To see plots, please do: "pip install bokeh==2.0.1"
  warnings.warn('AutoGluon summary plots cannot be created because bokeh is not installed. To see plots, please do: "pip install bokeh==2.0.1"')
Out[8]:
{'model_types': {'LightGBM_BAG_L1/T1': 'StackerEnsembleModel_LGB',
  'LightGBM_BAG_L1/T2': 'StackerEnsembleModel_LGB',
  'LightGBM_BAG_L1/T3': 'StackerEnsembleModel_LGB',
  'NeuralNetTorch_BAG_L1/T1': 'StackerEnsembleModel_TabularNeuralNetTorch',
  'WeightedEnsemble_L2': 'WeightedEnsembleModel',
  'LightGBM_BAG_L2/T1': 'StackerEnsembleModel_LGB',
  'LightGBM_BAG_L2/T2': 'StackerEnsembleModel_LGB',
  'LightGBM_BAG_L2/T3': 'StackerEnsembleModel_LGB',
  'NeuralNetTorch_BAG_L2/T1': 'StackerEnsembleModel_TabularNeuralNetTorch',
  'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
 'model_performance': {'LightGBM_BAG_L1/T1': -135.4732072756916,
  'LightGBM_BAG_L1/T2': -135.02952795945737,
  'LightGBM_BAG_L1/T3': -134.19413006667938,
  'NeuralNetTorch_BAG_L1/T1': -142.04103202606262,
  'WeightedEnsemble_L2': -134.00298642499382,
  'LightGBM_BAG_L2/T1': -134.00103350983295,
  'LightGBM_BAG_L2/T2': -133.74992855442642,
  'LightGBM_BAG_L2/T3': -134.22476301488,
  'NeuralNetTorch_BAG_L2/T1': -137.55658482834824,
  'WeightedEnsemble_L3': -133.59815093816746},
 'model_best': 'WeightedEnsemble_L3',
 'model_paths': {'LightGBM_BAG_L1/T1': ['LightGBM_BAG_L1', 'T1'],
  'LightGBM_BAG_L1/T2': ['LightGBM_BAG_L1', 'T2'],
  'LightGBM_BAG_L1/T3': ['LightGBM_BAG_L1', 'T3'],
  'NeuralNetTorch_BAG_L1/T1': ['NeuralNetTorch_BAG_L1', 'T1'],
  'WeightedEnsemble_L2': ['WeightedEnsemble_L2'],
  'LightGBM_BAG_L2/T1': ['LightGBM_BAG_L2', 'T1'],
  'LightGBM_BAG_L2/T2': ['LightGBM_BAG_L2', 'T2'],
  'LightGBM_BAG_L2/T3': ['LightGBM_BAG_L2', 'T3'],
  'NeuralNetTorch_BAG_L2/T1': ['NeuralNetTorch_BAG_L2', 'T1'],
  'WeightedEnsemble_L3': ['WeightedEnsemble_L3']},
 'model_fit_times': {'LightGBM_BAG_L1/T1': 6.003300428390503,
  'LightGBM_BAG_L1/T2': 5.3025124073028564,
  'LightGBM_BAG_L1/T3': 4.5322184562683105,
  'NeuralNetTorch_BAG_L1/T1': 161.56598782539368,
  'WeightedEnsemble_L2': 1.0534660816192627,
  'LightGBM_BAG_L2/T1': 9.290328741073608,
  'LightGBM_BAG_L2/T2': 6.749785900115967,
  'LightGBM_BAG_L2/T3': 8.878019332885742,
  'NeuralNetTorch_BAG_L2/T1': 225.8680498600006,
  'WeightedEnsemble_L3': 0.6811370849609375},
 'model_pred_times': {'LightGBM_BAG_L1/T1': 0.00040268898010253906,
  'LightGBM_BAG_L1/T2': 0.00010943412780761719,
  'LightGBM_BAG_L1/T3': 0.00013875961303710938,
  'NeuralNetTorch_BAG_L1/T1': 0.00020051002502441406,
  'WeightedEnsemble_L2': 0.003286600112915039,
  'LightGBM_BAG_L2/T1': 0.06968808174133301,
  'LightGBM_BAG_L2/T2': 0.08638525009155273,
  'LightGBM_BAG_L2/T3': 0.07663559913635254,
  'NeuralNetTorch_BAG_L2/T1': 0.213881254196167,
  'WeightedEnsemble_L3': 0.0011477470397949219},
 'num_bag_folds': 8,
 'max_stack_level': 3,
 'model_hyperparams': {'LightGBM_BAG_L1/T1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'LightGBM_BAG_L1/T2': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'LightGBM_BAG_L1/T3': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'NeuralNetTorch_BAG_L1/T1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'WeightedEnsemble_L2': {'use_orig_features': False,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'LightGBM_BAG_L2/T1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'LightGBM_BAG_L2/T2': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'LightGBM_BAG_L2/T3': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'NeuralNetTorch_BAG_L2/T1': {'use_orig_features': True,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True},
  'WeightedEnsemble_L3': {'use_orig_features': False,
   'max_base_models': 25,
   'max_base_models_per_type': 5,
   'save_bag_folds': True}},
 'leaderboard':                       model   score_val  pred_time_val    fit_time  \
 0       WeightedEnsemble_L3 -133.598151       0.378901  419.581011   
 1        LightGBM_BAG_L2/T2 -133.749929       0.087237  184.153805   
 2        LightGBM_BAG_L2/T1 -134.001034       0.070539  186.694348   
 3       WeightedEnsemble_L2 -134.002986       0.003735  172.454185   
 4        LightGBM_BAG_L1/T3 -134.194130       0.000139    4.532218   
 5        LightGBM_BAG_L2/T3 -134.224763       0.077487  186.282038   
 6        LightGBM_BAG_L1/T2 -135.029528       0.000109    5.302512   
 7        LightGBM_BAG_L1/T1 -135.473207       0.000403    6.003300   
 8  NeuralNetTorch_BAG_L2/T1 -137.556585       0.214733  403.272069   
 9  NeuralNetTorch_BAG_L1/T1 -142.041032       0.000201  161.565988   
 
    pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  \
 0                0.001148           0.681137            3       True   
 1                0.086385           6.749786            2       True   
 2                0.069688           9.290329            2       True   
 3                0.003287           1.053466            2       True   
 4                0.000139           4.532218            1       True   
 5                0.076636           8.878019            2       True   
 6                0.000109           5.302512            1       True   
 7                0.000403           6.003300            1       True   
 8                0.213881         225.868050            2       True   
 9                0.000201         161.565988            1       True   
 
    fit_order  
 0         10  
 1          7  
 2          6  
 3          5  
 4          3  
 5          8  
 6          2  
 7          1  
 8          9  
 9          4  }
In [9]:
prediction_new_hpo = predictor_new_hpo.predict(test)
prediction_new_hpo = {'datetime': test['datetime'], 'Pred_count': prediction_new_hpo}
prediction_new_hpo = pd.DataFrame(data=prediction_new_hpo)
prediction_new_hpo.head()
/opt/conda/lib/python3.10/site-packages/autogluon/features/generators/fillna.py:58: FutureWarning: The 'downcast' keyword in fillna is deprecated and will be removed in a future version. Use res.infer_objects(copy=False) to infer non-object dtype, or pd.to_numeric with the 'downcast' keyword to downcast numeric results.
  X.fillna(self._fillna_feature_map, inplace=True, downcast=False)
Out[9]:
datetime Pred_count
0 2011-01-20 00:00:00 72.947769
1 2011-01-20 01:00:00 50.996433
2 2011-01-20 02:00:00 50.996471
3 2011-01-20 03:00:00 66.710915
4 2011-01-20 04:00:00 66.710983
In [10]:
# Remember to set all negative values to zero
prediction_new_hpo[prediction_new_hpo['Pred_count']<0] = 0
In [12]:
# Same submitting predictions
submission_new_hpo = pd.read_csv('submission.csv')
submission_new_hpo["count"] = prediction_new_hpo['Pred_count']
submission_new_hpo.to_csv("submission_new_hpo.csv", index=False)
In [16]:
!kaggle competitions submit -c bike-sharing-demand -f submission_new_hpo.csv -m "new features with hyperparameters"
1179.07s - pydevd: Sending message related to process being replaced timed-out after 5 seconds
100%|█████████████████████████████████████████| 188k/188k [00:00<00:00, 576kB/s]
Successfully submitted to Bike Sharing Demand
In [17]:
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
1188.97s - pydevd: Sending message related to process being replaced timed-out after 5 seconds
fileName                     date                 description                        status    publicScore  privateScore  
---------------------------  -------------------  ---------------------------------  --------  -----------  ------------  
submission_new_hpo.csv       2024-04-29 12:53:53  new features with hyperparameters  complete  1.29634      1.29634       
submission_new_features.csv  2024-04-29 11:42:11  new features                       complete  0.68259      0.68259       
submission.csv               2024-04-29 11:12:02  first raw submission               complete  1.7998       1.7998        

New Score of 1.29634¶

Step 7: Write a Report¶

Refer to the markdown file for the full report¶

Creating plots and table for report¶

In [ ]:
# Taking the top model score from each training run and creating a line plot to show improvement
# You can create these in the notebook and save them to PNG or use some other tool (e.g. google sheets, excel)
fig = pd.DataFrame(
    {
        "model": ["initial", "add_features", "hpo"],
        "score": [?, ?, ?]
    }
).plot(x="model", y="score", figsize=(8, 6)).get_figure()
fig.savefig('model_train_score.png')
In [19]:
# Take the 3 kaggle scores and creating a line plot to show improvement
fig = pd.DataFrame(
    {
        "test_eval": ["initial", "add_features", "hpo"],
        "score": [1.7998, 0.68259, 1.29634]
    }
).plot(x="test_eval", y="score", figsize=(8, 6)).get_figure()
fig.savefig('model_test_score.png')
No description has been provided for this image

Hyperparameter table¶

In [20]:
# The 3 hyperparameters we tuned with the kaggle score as the result
hyperparams_df = pd.DataFrame({
    "model": ["initial_model", "add_features_model", "hpo_model"],
    "hpo1": ['default_vals', 'default_vals', 'GBM: num_leaves: lower=26, upper=66'],
    "hpo2": ['default_vals', 'default_vals', 'NN_TORCH: dropout_prob: 0.0, 0.5'],
    "hpo3": ['default_vals', 'default_vals', 'GBM: num_boost_round: 100'],
    "score": [1.7998, 0.68259, 1.29634]
})
In [21]:
hyperparams_df.head()
Out[21]:
model hpo1 hpo2 hpo3 score
0 initial_model default_vals default_vals default_vals 1.79980
1 add_features_model default_vals default_vals default_vals 0.68259
2 hpo_model GBM: num_leaves: lower=26, upper=66 NN_TORCH: dropout_prob: 0.0, 0.5 GBM: num_boost_round: 100 1.29634
In [22]:
def plot_series(time, series, format="-", start=0, end=None, label=None):
    plt.plot(time[start:end], series[start:end], format, label=label)
    plt.xlabel("Time")
    plt.ylabel("Value")
    if label:
        plt.legend(fontsize=14)
    plt.grid(True)
In [23]:
sub_new = pd.read_csv('submission_new_features.csv')
In [ ]:
import matplotlib.pyplot as plt
series = train["count"].to_numpy()
time = train["datetime"].to_numpy()


plt.figure(figsize=(350, 15))
plot_series(time, series)
plt.title("Train Data time series graph")
#plot_series(time1, series1)
plt.show()
No description has been provided for this image
In [25]:
sub_new.loc[:, "datetime"] = pd.to_datetime(sub_new.loc[:, "datetime"])

series1 = sub_new["count"].to_numpy()
time1 = sub_new["datetime"].to_numpy()

plt.figure(figsize=(350, 15))
#plot_series(time, series)
plot_series(time1, series1)
plt.title("Test Data time series graph")
plt.show()
No description has been provided for this image
In [27]:
!jupyter nbconvert --to html project_notebook.ipynb
2269.41s - pydevd: Sending message related to process being replaced timed-out after 5 seconds
[NbConvertApp] WARNING | pattern 'project_notebook.ipynb' matched no files
This application is used to convert notebook files (*.ipynb)
        to various other formats.

        WARNING: THE COMMANDLINE INTERFACE MAY CHANGE IN FUTURE RELEASES.

Options
=======
The options below are convenience aliases to configurable class-options,
as listed in the "Equivalent to" description-line of the aliases.
To see all configurable class-options for some <cmd>, use:
    <cmd> --help-all

--debug
    set log level to logging.DEBUG (maximize logging output)
    Equivalent to: [--Application.log_level=10]
--show-config
    Show the application's configuration (human-readable format)
    Equivalent to: [--Application.show_config=True]
--show-config-json
    Show the application's configuration (json format)
    Equivalent to: [--Application.show_config_json=True]
--generate-config
    generate default config file
    Equivalent to: [--JupyterApp.generate_config=True]
-y
    Answer yes to any questions instead of prompting.
    Equivalent to: [--JupyterApp.answer_yes=True]
--execute
    Execute the notebook prior to export.
    Equivalent to: [--ExecutePreprocessor.enabled=True]
--allow-errors
    Continue notebook execution even if one of the cells throws an error and include the error message in the cell output (the default behaviour is to abort conversion). This flag is only relevant if '--execute' was specified, too.
    Equivalent to: [--ExecutePreprocessor.allow_errors=True]
--stdin
    read a single notebook file from stdin. Write the resulting notebook with default basename 'notebook.*'
    Equivalent to: [--NbConvertApp.from_stdin=True]
--stdout
    Write notebook output to stdout instead of files.
    Equivalent to: [--NbConvertApp.writer_class=StdoutWriter]
--inplace
    Run nbconvert in place, overwriting the existing notebook (only
            relevant when converting to notebook format)
    Equivalent to: [--NbConvertApp.use_output_suffix=False --NbConvertApp.export_format=notebook --FilesWriter.build_directory=]
--clear-output
    Clear output of current file and save in place,
            overwriting the existing notebook.
    Equivalent to: [--NbConvertApp.use_output_suffix=False --NbConvertApp.export_format=notebook --FilesWriter.build_directory= --ClearOutputPreprocessor.enabled=True]
--coalesce-streams
    Coalesce consecutive stdout and stderr outputs into one stream (within each cell).
    Equivalent to: [--NbConvertApp.use_output_suffix=False --NbConvertApp.export_format=notebook --FilesWriter.build_directory= --CoalesceStreamsPreprocessor.enabled=True]
--no-prompt
    Exclude input and output prompts from converted document.
    Equivalent to: [--TemplateExporter.exclude_input_prompt=True --TemplateExporter.exclude_output_prompt=True]
--no-input
    Exclude input cells and output prompts from converted document.
            This mode is ideal for generating code-free reports.
    Equivalent to: [--TemplateExporter.exclude_output_prompt=True --TemplateExporter.exclude_input=True --TemplateExporter.exclude_input_prompt=True]
--allow-chromium-download
    Whether to allow downloading chromium if no suitable version is found on the system.
    Equivalent to: [--WebPDFExporter.allow_chromium_download=True]
--disable-chromium-sandbox
    Disable chromium security sandbox when converting to PDF..
    Equivalent to: [--WebPDFExporter.disable_sandbox=True]
--show-input
    Shows code input. This flag is only useful for dejavu users.
    Equivalent to: [--TemplateExporter.exclude_input=False]
--embed-images
    Embed the images as base64 dataurls in the output. This flag is only useful for the HTML/WebPDF/Slides exports.
    Equivalent to: [--HTMLExporter.embed_images=True]
--sanitize-html
    Whether the HTML in Markdown cells and cell outputs should be sanitized..
    Equivalent to: [--HTMLExporter.sanitize_html=True]
--log-level=<Enum>
    Set the log level by value or name.
    Choices: any of [0, 10, 20, 30, 40, 50, 'DEBUG', 'INFO', 'WARN', 'ERROR', 'CRITICAL']
    Default: 30
    Equivalent to: [--Application.log_level]
--config=<Unicode>
    Full path of a config file.
    Default: ''
    Equivalent to: [--JupyterApp.config_file]
--to=<Unicode>
    The export format to be used, either one of the built-in formats
            ['asciidoc', 'custom', 'html', 'latex', 'markdown', 'notebook', 'pdf', 'python', 'qtpdf', 'qtpng', 'rst', 'script', 'slides', 'webpdf']
            or a dotted object name that represents the import path for an
            ``Exporter`` class
    Default: ''
    Equivalent to: [--NbConvertApp.export_format]
--template=<Unicode>
    Name of the template to use
    Default: ''
    Equivalent to: [--TemplateExporter.template_name]
--template-file=<Unicode>
    Name of the template file to use
    Default: None
    Equivalent to: [--TemplateExporter.template_file]
--theme=<Unicode>
    Template specific theme(e.g. the name of a JupyterLab CSS theme distributed
    as prebuilt extension for the lab template)
    Default: 'light'
    Equivalent to: [--HTMLExporter.theme]
--sanitize_html=<Bool>
    Whether the HTML in Markdown cells and cell outputs should be sanitized.This
    should be set to True by nbviewer or similar tools.
    Default: False
    Equivalent to: [--HTMLExporter.sanitize_html]
--writer=<DottedObjectName>
    Writer class used to write the
                                        results of the conversion
    Default: 'FilesWriter'
    Equivalent to: [--NbConvertApp.writer_class]
--post=<DottedOrNone>
    PostProcessor class used to write the
                                        results of the conversion
    Default: ''
    Equivalent to: [--NbConvertApp.postprocessor_class]
--output=<Unicode>
    Overwrite base name use for output files.
                Supports pattern replacements '{notebook_name}'.
    Default: '{notebook_name}'
    Equivalent to: [--NbConvertApp.output_base]
--output-dir=<Unicode>
    Directory to write output(s) to. Defaults
                                  to output to the directory of each notebook. To recover
                                  previous default behaviour (outputting to the current
                                  working directory) use . as the flag value.
    Default: ''
    Equivalent to: [--FilesWriter.build_directory]
--reveal-prefix=<Unicode>
    The URL prefix for reveal.js (version 3.x).
            This defaults to the reveal CDN, but can be any url pointing to a copy
            of reveal.js.
            For speaker notes to work, this must be a relative path to a local
            copy of reveal.js: e.g., "reveal.js".
            If a relative path is given, it must be a subdirectory of the
            current directory (from which the server is run).
            See the usage documentation
            (https://nbconvert.readthedocs.io/en/latest/usage.html#reveal-js-html-slideshow)
            for more details.
    Default: ''
    Equivalent to: [--SlidesExporter.reveal_url_prefix]
--nbformat=<Enum>
    The nbformat version to write.
            Use this to downgrade notebooks.
    Choices: any of [1, 2, 3, 4]
    Default: 4
    Equivalent to: [--NotebookExporter.nbformat_version]

Examples
--------

    The simplest way to use nbconvert is

            > jupyter nbconvert mynotebook.ipynb --to html

            Options include ['asciidoc', 'custom', 'html', 'latex', 'markdown', 'notebook', 'pdf', 'python', 'qtpdf', 'qtpng', 'rst', 'script', 'slides', 'webpdf'].

            > jupyter nbconvert --to latex mynotebook.ipynb

            Both HTML and LaTeX support multiple output templates. LaTeX includes
            'base', 'article' and 'report'.  HTML includes 'basic', 'lab' and
            'classic'. You can specify the flavor of the format used.

            > jupyter nbconvert --to html --template lab mynotebook.ipynb

            You can also pipe the output to stdout, rather than a file

            > jupyter nbconvert mynotebook.ipynb --stdout

            PDF is generated via latex

            > jupyter nbconvert mynotebook.ipynb --to pdf

            You can get (and serve) a Reveal.js-powered slideshow

            > jupyter nbconvert myslides.ipynb --to slides --post serve

            Multiple notebooks can be given at the command line in a couple of
            different ways:

            > jupyter nbconvert notebook*.ipynb
            > jupyter nbconvert notebook1.ipynb notebook2.ipynb

            or you can specify the notebooks list in a config file, containing::

                c.NbConvertApp.notebooks = ["my_notebook.ipynb"]

            > jupyter nbconvert --config mycfg.py

To see all available configurables, use `--help-all`.

In [28]:
cd ~/documents
[Errno 2] No such file or directory: '/home/sagemaker-user/documents'
/home/sagemaker-user/cd0385-project-starter
In [ ]: